I. Basis of the report
□ the claims, Nos.:

V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial
applicability; citations and explanations supporting such statement



1. Statement

Novelty (N)                      Yes:  Claims  2-15, 17-20, 25-28, 43-45
                                 No:   Claims  1, 16, 21-24, 29-42, 46

Inventive step (IS)              Yes:  Claims  9, 17-20, 25-28, 43-45
                                 No:   Claims  1-8, 10-16, 21-24, 29-42, 46

Industrial applicability (IA)    Yes:  Claims  1-46
                                 No:   Claims



2. Citations and explanations 
see separate sheet 

VI. Certain documents cited 

1. Certain published documents (Rule 70.10)
and / or 

2. Non-written disclosures (Rule 70.9) 
see separate sheet 

VII. Certain defects in the international application 

The following defects in the form or contents of the international application have been noted: 
see separate sheet 

VIII. Certain observations on the international application 

The following observations on the clarity of the claims, description, and drawings or on the question whether the 
claims are fully supported by the description, are made: 

see separate sheet 
V. REASONED STATEMENT UNDER ARTICLE 35(2)

1) This international preliminary examination report has been established
considering the priority date 14.05.96 as a valid date. The Applicant is reminded
that the documents

WO 97/11167 published on 27.03.97,
WO 97/10704 published on 27.03.97,
Development 124, p. 2049-2062 (1997)

cited in the International Search Report may become relevant after consideration
of the priority document which is unavailable at present.

2) The present application relates to methods and materials used to generate 
apomictic seeds. 

Apomixis is an asexual method of reproduction in plants whereby the embryo is 
derived from mitotic division of a megaspore mother cell or a somatic cell of the 
ovule. Meiosis and fertilization are not involved in development of the embryo and 
the progeny of apomictic plants are exact replicas of the female plant. The genetic 
loci controlling apomixis are not identified. The Applicant has established an 
embryogenic cell culture upon incubation of seed-derived seedling hypocotyl 
explants of Daucus carota in auxin-containing medium. Differential screening for 
genes expressed in cells that form somatic embryos but not in cells that do not 
form somatic embryos resulted in the identification of a gene transiently expressed 
during the transition between the somatic and embryogenic cell state. The
identified gene, termed SERK, encodes a receptor-like protein kinase with a
leucine-rich repeat domain and is presented in SEQ ID NO 1-3 (genomic, cDNA
and protein sequence, respectively, of Daucus carota) and in SEQ ID NO 20, 32, 33
(genomic, cDNA and protein sequence, respectively, of Arabidopsis thaliana). Based
on the developmental pattern of expression of said gene, it may represent a
significant part of one of the mechanisms controlling apomictic reproduction.

A method involving transformation of plant material with the identified SERK gene 
and ectopic expression of said gene in the vicinity of the embryo sac is expected 
to produce apomictic seeds. 
The subject-matter of Claims 9, 17-20, 25-28 and 43-45 relates to the method
outlined above or the sequence involved. In the light of the cited literature, said
subject-matter is neither anticipated nor obvious to the skilled person. Thus, said
claims meet the requirements of Article 33.2 and 33.3 PCT for novelty and
inventive step.

3) The subject-matter of Claim 1 does not meet the requirements of Article 33.2 PCT 
for novelty. 

Document WO 89/00810 published 9.2.89 (D1) which is considered to represent 
the closest prior art describes a method of production of apomictic seeds whereby 
plant material is transformed with a nucleic acid-containing particle called 
AMS/vector (p. 24-26, 32-35). Said nucleic acid is identified as a 3.5kb DNA 
molecule (p. 40). No further characterization of the nucleic acid is presented in D1. 
However, it is implicit in said document that, in order for such a DNA molecule to
effect production of apomictic seeds, it must encode a protein which induces the
formation of a somatic embryo and, since the embryo proceeds to form a seed, said DNA
must be expressed in the vicinity of the embryo sac. Therefore, the method of
Claim 1 has already been disclosed in document D1.

The subject-matter of Claims 16, 35-42 and 46 relates to DNA encoding a protein 
capable of rendering a cell embryogenic; a vector comprising said DNA; plant cell 
or plants transformed with said vector; a method utilizing pollen of said plants. As 
explained above, said subject-matter has been anticipated in document D1 and 
thus, said claims do not meet the requirements of Article 33.2 PCT for novelty. 

4) Dependent Claims 2-8 and 10-15 include technical features that may contribute
to the novelty of the subject-matter as claimed in Claim 1 over the prior art
document D1; however, said claims do not involve an inventive step. It is
customary practice in the field of genetic engineering to isolate and characterize
DNA sequences that exert an obvious cellular effect, especially if the advantage of
using this DNA sequence can easily be contemplated, as is the case in document
D1, where a still uncharacterized DNA molecule induces the production of
apomictic plants and seeds.
5) The subject-matter of independent Claims 21-24 does not meet the requirements
of Article 33.2 PCT for novelty.

The scope of the claims as filed extends to cover any DNA sequence encoding a 
protein that bears similarity to the specified SEQ ID NO which is also capable of 
being membrane bound and which has kinase activity. 

Document Science 270, p. 1804-1806 of 15.12.1995 (D2) discloses a protein
encoded by the rice gene Xa21 which confers resistance to Xanthomonas oryzae 
pv. oryzae race 6. Said protein carries a serine-threonine kinase-like domain and 
is believed to be a cell surface bound protein. Thus, a protein that satisfies the 
criteria set for the proteins of Claims 21-24 is already known in the art. 

6) Similarly, the subject-matter of dependent Claims 29-34 does not meet the 
requirements of Article 33.2 for novelty because the additional technical features 
present in said claims do not overcome the lack of novelty of the claim they 
depend on. 

VI. CERTAIN DOCUMENTS 

7) The following documents are cited under Rule 70.10 PCT 

WO 97/11167 published on 27.03.97, filed on 23.09.96, with priority date 22.09.95
WO 97/10704 published on 27.03.97, filed on 23.09.96, with priority date 22.09.95 

VII. CERTAIN DEFECTS IN THE INTERNATIONAL APPLICATION 

8) Contrary to the requirements of Rule 5.1(a)(ii) PCT, the relevant background art
disclosed in the document D1 is not mentioned in the description, nor is this 
document identified therein. 

VIII. CERTAIN OBSERVATIONS ON THE INTERNATIONAL APPLICATION 

9) The subject-matter of independent Claim 1 and dependent Claims 2-8 and 10-15
does not meet the requirements of Article 6 and Rule 6 (a) PCT.
The technical features of Claim 1 are defined as follows:

i) transformation of plant material with a nucleotide sequence which induces 
embryogenesis 

ii) regeneration of the plant material into plants 

iii) expressing the sequence in the vicinity of the embryo sac. 

Technical feature (i) is considered to be ambiguous in the sense that said
nucleotide sequence is defined as capable of rendering a cell embryogenic, which
is merely the result to be achieved by the invention. The claims as filed disclose
no technical features concerning the nucleotide sequence, which is the way to
arrive at the invention, and thus do not disclose the subject-matter for which
protection is sought. A possibility to overcome this objection may be the definition of said
nucleotide sequence by its formula, i.e. the primary nucleotide sequence of SEQ
ID NO 1 or 2 or 20 or 32.

10) The subject-matter of Claims 12, 31, and 45 is not clear as required by Rule 6 
PCT. 

The promoter or the protein used in the claimed methods, respectively, are only 
defined by an arbitrary designation, namely "SERK" without disclosing any 
technical feature which unambiguously characterizes the claimed subject-matter. 
A gene and/or a protein being chemical products should be clearly defined by their 
formula i.e. their nucleotide and/or amino acid sequence. 

11) The dependencies of Claims 42 and 46, insofar as they concern Claim 40, are wrong
because said claim is not a method claim.
Production of Apomictic Seed



The present invention relates to the production of genetically transformed plants. In particular 
the invention relates inter alia to a process for inducing apomixis, to the apomictic seeds which 
result from the process, and to the plants and progeny thereof which result from the germination 
of such seeds. 

Apomixis, which is vegetative (non-sexual) reproduction through seeds, is a genetically 
controlled reproductive mechanism found in some polyploid non-cultivated species. The process 
is classified as gametophytic or non-gametophytic. In gametophytic apomixis - of which there 
are two types (apospory and diplospory), multiple embryo sacs which typically lack antipodal 
nuclei are formed, or else megasporogenesis in the embryo sac takes place. In adventitious 
embryony (non-gametophytic apomixis), a somatic embryo develops directly from the cells of the 
embryo sac, ovary wall or integuments. In adventitious embryony, somatic embryos from 
surrounding cells invade the sexual ovary, one of the somatic embryos out-competes the other 
somatic embryos and the sexual embryo and utilizes the produced endosperm. 

Were apomixis to be a controllable and reproducible phenomenon it would provide many 
advantages in plant improvement and cultivar development in the case that sexual plants are 
available as crosses with the apomictic plant. 

For example, apomixis would provide for true-breeding, seed propagated hybrids. Moreover,
apomixis could shorten and simplify the breeding process so that selfing and progeny testing to
produce and/or stabilize a desirable gene combination could be eliminated. Apomixis would
provide for the use as cultivars of genotypes with unique gene combinations since apomictic
genotypes breed true irrespective of heterozygosity. Genes or groups of genes could thus be
"pyramided" and "fixed" in super genotypes. Every superior apomictic genotype from a
sexual-apomictic cross would have the potential to be a cultivar. Apomixis would allow plant breeders to
develop cultivars with specific stable traits for such characters as height, seed and forage quality
and maturity. Breeders would not be limited in their commercial production of hybrids by (i) a
cytoplasmic-nuclear interaction to produce male sterile female parents or (ii) the fertility restoring
capacity of a pollinator. Almost all cross-compatible germplasm could be a potential parent to
produce apomictic hybrids.

Finally, apomixis would simplify commercial hybrid seed production. In particular, (i) the need for 
physical isolation of commercial hybrid production fields would be eliminated; (ii) all available 
land could be used to increase hybrid seed instead of dividing space between pollinators and 
male sterile lines; and (iii) the need to maintain parental line seed stocks would be eliminated. 

The potential benefits to accrue from the production of seed via apomixis are presently 
unrealized, to a large extent because of the problem of engineering apomictic capacity into 
plants of interest. The present invention provides a solution to that problem in that it provides 
the means for obtaining plants which exhibit the adventitious embryony type of apomixis. 

According to the present invention there is provided a method of producing apomictic seeds 
comprising the steps of: 

(i) transforming plant material with a nucleotide sequence encoding a protein the 
presence of which in an active form in a cell, or membrane thereof, renders said cell 
embryogenic, 

(ii) regenerating the thus transformed material into plants, or carpel-containing parts 
thereof, and 

(iii) expressing the sequence in the vicinity of the embryo sac. 

By "vicinity of the embryo sac" is meant in one or more of the following: carpel, integuments,
ovule, ovule primordium, ovary wall, chalaza, nucellus, funicle and placenta. The skilled man
will recognize that the term "integuments" also includes those tissues, such as endothelium, 
which are derived therefrom. By "embryogenic" is meant the capability of cells to develop into 
an embryo under permissive conditions. It will be appreciated that the term "in an active form" 
includes proteins which are truncated or otherwise mutated with the proviso that they initiate or 
amplify embryogenesis whether or not in doing this they interact with the signal transduction 
components that they otherwise would in the tissues in which they are normally present. 

The term "plant material" includes protoplasts, isolated plant cells (such as stomatal guard cells) 
possessing a cell wall, pollen, whole tissues such as emerged radicle, stem, leaf, petal, 
hypocotyl section, apical meristem, ovaries, zygotic embryo per se, roots, vascular bundle,
pericycle, anther filament, somatic embryos and the like. 

A further embodiment of the invention relates to a DNA molecule comprising a nucleotide 
sequence encoding a protein the presence of which in an active form in a cell, or membrane 
thereof, renders said cell embryogenic. 

The said nucleotide sequence may be introduced into the plant material, inter alia, via a bacterial 
or viral vector, by micro-injection, by co-incubation of the plant material and sequence in the 
presence of a high molecular weight glycol or by coating of the sequence onto the surface of a 
biologically inert particle which is then introduced into the material. 

Expression of the sequence may yield a protein kinase capable of spanning a plant cell
membrane. Typically the kinase may be a leucine rich repeat receptor like kinase which has the
capacity to auto-phosphorylate. The skilled man will recognize what is meant by the term
"leucine rich repeat receptor like kinase". Examples of such proteins include Arabidopsis RLK5
(Walker, 1993), Arabidopsis RPS2 (Bent et al. 1994), Tomato CF-9 gene product (Jones et al.
1994), Tomato N (Whitham et al. 1994), Petunia PRK1 (Mu et al. 1994), the product of the
Drosophila Toll gene (Hashimoto et al. 1988), the protein kinase encoded by the rice OsPKW
gene (Zhao et al. 1994), the translation product of the rice EST clone ric2976 and the product of
the Drosophila Pelle gene (Shelton and Wasserman, 1993). Still further examples of such
proteins include the TMK1, Clavata1, Erecta, and TMKL1 gene products from Arabidopsis, the
Flightless-1 gene product from Drosophila, the TrkC gene product from pig, the rat LhCG
receptor and FSH receptor, the dog TSH receptor, and the human Trk receptor kinase. The
protein may comprise a ligand binding domain, a proline box, a transmembrane domain, a
kinase domain and a protein binding domain. In many receptor kinases the extracellular (ligand
binding) domain serves as an inhibitor of the kinase domain in the ligand-free state. This arrest is
removed after binding of the ligand. Accordingly, in one embodiment of the invention the protein
either lacks a ligand binding domain or the domain is functionally inactivated so that the kinase
domain can be constitutively active in the absence of an activating signal (ligand). Whether or
not the protein possesses a ligand binding domain, functional or otherwise, once expressed and
incorporated into the plant cell membrane the protein binding domain is preferably located
intracellularly.
In a preferred embodiment of the method, the said sequence further encodes a cell membrane
targeting sequence. The sequence may be that which is depicted in SEQ ID Nos. 1, 2, 20, or
32, or it may be similar in that it is complementary to a sequence which hybridizes under
stringent conditions with the said sequences and which encodes a membrane bound protein
having kinase activity. By "similar" is meant a sequence which is complementary to a test
sequence which is capable of hybridizing to the inventive sequence. When the test and 
inventive sequences are double stranded the nucleic acid constituting the test sequence 
preferably has a TM within 20°C of that of the inventive sequence. In the case that the test and 
inventive sequences are mixed together and denatured simultaneously, the TM values of the 
sequences are preferably within 10°C of each other. More preferably the hybridization is 
performed under stringent conditions, with either the test or inventive DNA preferably being 
supported. Thus either a denatured test or inventive sequence is preferably first bound to a 
support and hybridization is effected for a specified period of time at a temperature of between 
50 and 70°C in double strength citrate buffered saline (SSC) containing 0.1%SDS followed by 
rinsing of the support at the same temperature but with a buffer having a reduced SSC 
concentration. Depending upon the degree of stringency required, and thus the degree of 
similarity of the sequences, at a particular temperature, - such as 60°C, for example - such 
reduced concentration buffers are typically single strength SSC containing 0.1%SDS, half 
strength SSC containing 0.1%SDS and one tenth strength SSC containing 0.1%SDS. 
Sequences having the greatest degree of similarity are those the hybridization of which is least 
affected by washing in buffers of reduced concentration. It is most preferred that the test and 
inventive sequences are so similar that the hybridization between them is substantially 
unaffected by washing or incubation in one tenth strength sodium citrate buffer containing 
0.1%SDS. 
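
The preferred conditions described above can be summarized in a short Python sketch. It is illustrative only: the names WashStep, wash_series and tm_within_preferred_range are hypothetical and do not appear in the application, and the sketch simply encodes the stated figures (hybridization and rinsing at between 50 and 70°C, rinse buffers of single, half and one tenth strength SSC with 0.1% SDS, and TM differences of 20°C or 10°C).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WashStep:
    """One rinse of the support (hypothetical helper type)."""
    ssc_strength: float   # multiple of single strength SSC
    sds_percent: float    # SDS concentration in percent
    temperature_c: float  # same temperature as the hybridization


def wash_series(temperature_c: float = 60.0) -> list[WashStep]:
    """Return the rinses of decreasing SSC strength named in the text.

    Hybridization itself is in double strength SSC; the rinses use
    single, half and one tenth strength SSC, each with 0.1% SDS.
    """
    if not 50.0 <= temperature_c <= 70.0:
        raise ValueError("the text specifies a temperature between 50 and 70 C")
    return [WashStep(s, 0.1, temperature_c) for s in (1.0, 0.5, 0.1)]


def tm_within_preferred_range(tm_test_c: float, tm_inventive_c: float,
                              denatured_together: bool = False) -> bool:
    """Check the preferred TM difference between test and inventive sequences.

    The limit is 20 C for double stranded sequences, or 10 C when both
    sequences are mixed together and denatured simultaneously.
    """
    limit = 10.0 if denatured_together else 20.0
    return abs(tm_test_c - tm_inventive_c) <= limit
```

A sequence whose hybridization survives the last step of wash_series (one tenth strength SSC) would correspond to the most preferred degree of similarity; measuring that survival is an experimental matter outside the scope of the sketch.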

Accordingly, further comprised by the present invention is a DNA sequence as depicted in SEQ 
ID NOS: 22, 24, 26, 28 and 30 or a sequence which is complementary to one which hybridizes 
under stringent conditions with the said sequences and which encodes a membrane bound 
protein having kinase activity. 

The sequence may be modified in that known mRNA instability motifs or polyadenylation signals 
may be removed and/or codons which are preferred by the plant into which the sequence is to 
be inserted may be used so that expression of the thus modified sequence in the said plant may 
yield substantially similar protein to that obtained by expression of the unmodified sequence in 
the organism in which the protein is endogenous. 
In order to obtain expression of the sequence in the regenerated plant (and in particular the
carpel thereof) in a tissue specific manner the sequence is preferably under expression control
of an inducible or developmentally regulated promoter, typically one of the following: a promoter
which regulates expression of SERK genes in planta, the Arabidopsis ANT gene promoter, the
promoter of the O126 gene from Phalaenopsis, the carrot chitinase DcEP3-1 gene promoter, the
Arabidopsis AtChitIV gene promoter, the Arabidopsis LTP-1 gene promoter, the Arabidopsis
bel-1 gene promoter, the petunia fbp-7 gene promoter, the Arabidopsis AtDMC1 promoter, the
pTA7001 inducible promoter. The DcEP3-1 gene is expressed transiently during inner
integument degradation and later in cells that line the inner part of the developing endosperm.
The AtChitIV gene is transiently expressed in the micropylar endosperm up to cellularisation. The
LTP-1 promoter is active in the epidermis of the developing nucellus, both integuments, seed
coat and early embryo. The bel-1 gene is expressed in the developing inner integument and the
fbp-7 promoter is active during embryo sac development. The Arabidopsis ANT gene is
expressed during integument development, and the O126 gene from Phalaenopsis is expressed
in the mature ovule.

It is most preferred that the sequence is expressed in the somatic cells of the embryo sac, ovary 
wall, nucellus, or integuments. 

The endosperm within the apomictic seed results from fusion of polar nuclei within the embryo 
sac with a pollen-derived male gamete nucleus. It is preferred that the sequence encoding the 
protein is expressed prior to fusion of the polar nuclei with the male gamete nucleus. 

The invention further includes a DNA, but preferably a recombinant DNA, comprising a
sequence encoding a protein the presence of which in an active form in a cell, or membrane
thereof, renders said cell embryogenic. Preferred is a DNA encoding a protein which is a leucine
rich repeat receptor like kinase and comprises a ligand binding domain, a proline box, a
transmembrane domain, a kinase domain and a protein binding domain, the ligand binding
domain optionally being absent or functionally inactive.

In particular, the invention embodies a DNA comprising a DNA sequence encoding an N-terminal
protein fragment having the following amino acid sequence: Gln Ser Trp Asp Pro Thr Leu Val Asn Pro
Cys Thr Trp Phe His Val Thr Cys Asn.
A specific embodiment of the invention relates to a DNA comprising a DNA sequence encoding
a protein having the sequence depicted in SEQ ID Nos. 3 or 21, or a protein substantially similar
thereto which is capable of being membrane bound and which has kinase activity. By
"substantially similar" is meant a pure protein having an amino acid sequence which is at least
90% similar to the sequence of the proteins depicted in SEQ ID No 3 below. In the context of
the present invention, two amino acid sequences with at least 90% similarity to each other have
at least 90% identical or conservatively replaced amino acid residues in a like position when
aligned optimally allowing for up to 8 gaps with the proviso that in respect of each gap a total of not
more than 4 amino acid residues is affected. For the purpose of the present invention
conservative replacements may be made between amino acids within the following groups: 

(i) Serine and Threonine; 

(ii) Glutamic acid and Aspartic acid; 

(iii) Arginine and Lysine; 

(iv) Asparagine and Glutamine; 

(v) Isoleucine, Leucine, Valine and Methionine; 

(vi) Phenylalanine, Tyrosine and Tryptophan 

(vii) Alanine and Glycine 

In addition, non-conservative replacements may also occur at a low frequency. Accordingly, the
invention further embodies a DNA comprising a DNA sequence encoding an N-terminal protein
fragment having the following amino acid sequence: Val Xaa Gln Ser Trp Asp Pro Thr Leu Val Asn Pro
Cys Thr Trp Phe His Val Thr Cys Asn, with Xaa being Leu or Val.
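
The 90% similarity criterion and the conservative replacement groups (i) to (vii) defined above lend themselves to a small worked example. The following Python sketch is one possible reading and not a method taken from the application: the function names are hypothetical, the two sequences are assumed to be supplied already optimally aligned (one-letter codes, "-" for a gap position), and the percentage is assumed to be computed over the aligned positions at which neither sequence has a gap.

```python
import re

# Conservative replacement groups (i)-(vii) from the text, in one-letter codes.
CONSERVATIVE_GROUPS = [
    set("ST"),    # (i)   Serine and Threonine
    set("ED"),    # (ii)  Glutamic acid and Aspartic acid
    set("RK"),    # (iii) Arginine and Lysine
    set("NQ"),    # (iv)  Asparagine and Glutamine
    set("ILVM"),  # (v)   Isoleucine, Leucine, Valine and Methionine
    set("FYW"),   # (vi)  Phenylalanine, Tyrosine and Tryptophan
    set("AG"),    # (vii) Alanine and Glycine
]


def residues_match(a: str, b: str) -> bool:
    """True if two residues are identical or conservatively replaced."""
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def gap_lengths(aligned: str) -> list[int]:
    """Lengths of each run of '-' characters in one aligned sequence."""
    return [len(m.group()) for m in re.finditer(r"-+", aligned)]


def is_substantially_similar(aligned_a: str, aligned_b: str,
                             threshold: float = 0.90,
                             max_gaps: int = 8,
                             max_gap_len: int = 4) -> bool:
    """Apply the 90% / 8 gaps / 4 residues per gap criterion to an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    gaps = gap_lengths(aligned_a) + gap_lengths(aligned_b)
    if len(gaps) > max_gaps or any(g > max_gap_len for g in gaps):
        return False
    # Assumption: only positions without a gap count towards the percentage.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return False
    matches = sum(residues_match(a, b) for a, b in pairs)
    return matches / len(pairs) >= threshold
```

For instance, two fragments that differ only by a Leu to Val exchange count as fully similar under this reading, because both residues belong to group (v).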

Especially preferred within the scope of the invention is a DNA comprising a DNA sequence
encoding an N-terminal protein fragment having the following amino acid sequence: Val Xaa Gln
Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp Phe His Val Thr Cys Asn Xab Xac Xad Xae Val Xaf Arg Val
Asp Leu Gly Asn Xag Xah Leu Ser [illegible] Leu Xai Pro [illegible] Leu Gly Xaj Leu Xak Xal Leu Gln, with Xaa to
Xal representing variable amino acids, but preferably

Xaa = Leu or Val

Xab = Asn or Gln

Xac = Glu or Asp or His

Xad = Asn or His

Xae = Ser or Arg or Gln

Xaf = Ile or Thr

Xag = Ala or Ser

Xah = Glu or Asn

Xai = Val or Ala

Xaj = Val or Lys

Xak = Lys or Glu

Xal = Asn or His
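
How a fragment with variable positions can be checked against a candidate protein may be illustrated on the shorter 21-residue N-terminal fragment given earlier (Val Xaa Gln Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp Phe His Val Thr Cys Asn, with Xaa being Leu or Val). The Python sketch below is an illustration only: the names THREE_TO_ONE, FRAGMENT, motif_regex and contains_fragment are hypothetical, and the conversion to one-letter codes is a convenience of the example rather than notation used in the application.

```python
import re

# Three-letter to one-letter codes for the residues used in the fragment.
THREE_TO_ONE = {
    "Val": "V", "Leu": "L", "Gln": "Q", "Ser": "S", "Trp": "W", "Asp": "D",
    "Pro": "P", "Thr": "T", "Asn": "N", "Cys": "C", "Phe": "F", "His": "H",
}

# The fragment as a list of tokens; a tuple marks a variable (Xaa) position
# and lists the residues permitted there.
FRAGMENT = [
    "Val", ("Leu", "Val"), "Gln", "Ser", "Trp", "Asp", "Pro", "Thr", "Leu",
    "Val", "Asn", "Pro", "Cys", "Thr", "Trp", "Phe", "His", "Val", "Thr",
    "Cys", "Asn",
]


def motif_regex(tokens) -> re.Pattern:
    """Build a regular expression over one-letter codes from the tokens."""
    parts = []
    for token in tokens:
        if isinstance(token, tuple):
            # Variable position: a character class of the allowed residues.
            parts.append("[" + "".join(THREE_TO_ONE[t] for t in token) + "]")
        else:
            parts.append(THREE_TO_ONE[token])
    return re.compile("".join(parts))


def contains_fragment(protein: str) -> bool:
    """True if the one-letter protein sequence contains the fragment."""
    return motif_regex(FRAGMENT).search(protein) is not None
```

The same token list could be extended with further tuples for additional variable positions; an exact-match search of this kind does not allow for the conservative or non-conservative replacements discussed in connection with the 90% similarity criterion.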

It is preferred that the DNA further encodes a cell membrane targeting sequence, and that the
protein encoding region is under expression control of a developmentally regulated or inducible
promoter, such as, for example, a promoter which regulates expression of SERK genes in
plants, the carrot chitinase DcEP3-1 gene promoter, the Arabidopsis AtChitIV gene promoter,
the Arabidopsis LTP-1 gene promoter, the Arabidopsis bel-1 gene promoter, the petunia fbp-7
gene promoter, the Arabidopsis ANT gene promoter, the promoter of the O126 gene from
Phalaenopsis, the Arabidopsis AtDMC1 promoter, or the pTA7001 inducible promoter.

Particularly preferred embodiments of the said DNA include those depicted in SEQ ID Nos. 1, 2,
20 or 32, or those which are complementary to one which hybridizes under stringent conditions 
with the said sequences and which encode a membrane bound protein having kinase activity. 
As indicated above, the DNA may be modified in that known mRNA instability motifs or 
polyadenylation signals may be removed and/or codons which are preferred by the plant into 
which the DNA is to be inserted may be used so that expression of the thus modified DNA in the 
said plant may yield substantially similar protein to that obtained by expression of the unmodified 
DNA in the organism in which the protein is endogenous. 

The invention still further includes a vector which contains DNA as indicated in the three 
immediately preceding paragraphs, plants transformed with the recombinant DNA or vector, and 
the progeny of such plants which contain the DNA stably incorporated, and/or the apomictic 
seeds of such plants or such progeny. 

The recombinant DNA molecules of the invention can be introduced into the plant cell in 
a number of art-recognized ways. Those skilled in the art will appreciate that the choice 
of method might depend on the type of plant, i.e. monocot or dicot, targeted for
transformation. Suitable methods of transforming plant cells include microinjection
(Crossway et al., BioTechniques 4:320-334 (1986)), electroporation (Riggs et al., Proc.
Natl. Acad. Sci. USA 83:5602-5606 (1986)), Agrobacterium mediated transformation
(Hinchee et al., Biotechnology 6:915-921 (1988)), direct gene transfer (Paszkowski et al.,
EMBO J. 3:2717-2722 (1984)), ballistic particle acceleration using devices available from
Agracetus, Inc., Madison, Wisconsin and Dupont, Inc., Wilmington, Delaware (see, for
example, Sanford et al., U.S. Patent 4,945,050; and McCabe et al., Biotechnology
6:923-926 (1988)), and protoplast transformation/regeneration methods (see U.S. Patent
No. 5,350,689 issued Sept. 27, 1994 to Ciba-Geigy Corp.). Also see Weissinger et al.,
Annual Rev. Genet. 22:421-477 (1988); Sanford et al., Particulate Science and
Technology 5:27-37 (1987)(onion); Christou et al., Plant Physiol. 87:671-674
(1988)(soybean); McCabe et al., Bio/Technology 6:923-926 (1988)(soybean); Datta et
al., Bio/Technology 8:736-740 (1990)(rice); Klein et al., Proc. Natl. Acad. Sci. USA,
85:4305-4309 (1988)(maize); Klein et al., Bio/Technology 6:559-563 (1988)(maize); Klein
et al., Plant Physiol. 91:440-444 (1988)(maize); Fromm et al., Bio/Technology 8:833-839
(1990); and Gordon-Kamm et al., Plant Cell 2:603-618 (1990)(maize).

Comprised within the scope of the present invention are transgenic plants, in particular 
transgenic fertile plants transformed by means of the aforedescribed processes and their 
asexual and/or sexual progeny, which still contain the DNA stably incorporated, and/or the 
apomictic seeds of such plants or such progeny. 

The transgenic plant according to the invention may be a dicotyledonous or a
monocotyledonous plant. Such plants include field crops, vegetables and fruits including
tomato, pepper, melon, lettuce, cauliflower, broccoli, cabbage, brussels sprout, sugar beet,
corn, sweetcorn, onion, carrot, leek, cucumber, tobacco, alfalfa, aubergine, beet, broad bean,
celery, chicory, cow pea, endive, gourd, groundnut, papaya, pea, peanut, pineapple, potato,
safflower, snap bean, soybean, spinach, squashes, sunflower, sorghum, water-melon, and
the like; and ornamental crops including Impatiens, Begonia, Petunia, Pelargonium, Viola,
Cyclamen, Verbena, Vinca, Tagetes, Primula, Saintpaulia, Ageratum, Amaranthus,
Antirrhinum, Aquilegia, Chrysanthemum, Cineraria, Clover, Cosmos, Cowpea, Dahlia, Datura,
Delphinium, Gerbera, Gladiolus, Gloxinia, Hippeastrum, Mesembryanthemum, Salpiglossis,
Zinnia, and the like. In a preferred embodiment, the DNA is expressed in "seed crops" such
as corn, sweet corn and peas etc. in such a way that the apomictic seed which results from
such expression is not physically mutated or otherwise damaged in comparison with seed
from untransformed like crops. Preferred are monocotyledonous plants of the
Graminaceae family involving Lolium, Zea, Triticum, Triticale, Sorghum, Saccharum,
Bromus, Oryzae, Avena, Hordeum, Secale and Setaria plants.

More preferred are transgenic maize, wheat, barley, sorghum, rye, oats, turf and forage 
grasses, millet and rice. Especially preferred are maize, wheat, sorghum, rye, oats, turf 
grasses and rice. 

Among the dicotyledonous plants Arabidopsis, soybean, cotton, sugar beet, sugar cane, 
oilseed rape, tobacco and sunflower are more preferred herein. Especially preferred are 
soybean, cotton, tobacco, sugar beet and oilseed rape. 

The expression "progeny" is understood to embrace both "asexually" and "sexually"
generated progeny of transgenic plants. This definition is also meant to include all
mutants and variants obtainable by means of known processes, such as for example cell
fusion or mutant selection, and which still exhibit the characteristic properties of the initial
transformed plant, together with all crossing and fusion products of the transformed plant
material. This also includes progeny plants that result from a backcrossing, as long as
the said progeny plants still contain the DNA according to the invention.

Another object of the invention concerns the proliferation material of transgenic 
plants. 

The proliferation material of transgenic plants is defined relative to the invention as any 
plant material that may be propagated sexually or asexually in vivo or in vitro. Particularly 
preferred within the scope of the present invention are protoplasts, cells, calli, tissues, 
organs, seeds, embryos, pollen, egg cells, zygotes, together with any other propagating 
material obtained from transgenic plants. 

Parts of plants, such as for example flowers, stems, fruits, leaves, roots originating in 
transgenic plants or their progeny previously transformed by means of the process of the 
invention and therefore consisting at least in part of transgenic cells, are also an object 
of the present invention. Especially preferred are apomictic seeds.

A further object of the invention is a method of producing apomictic seeds, but preferably 
seeds that are of the adventitious embryony type, comprising the steps of: 

(i) transforming plant material with a nucleotide sequence encoding a protein the 
presence of which in an active form in a cell, or membrane thereof, renders said cell 
embryogenic, but preferably a protein which is a protein kinase capable of spanning a
plant cell membrane and capable of autophosphorylation,

(ii) regenerating the thus transformed material into plants, or carpel-containing parts 
thereof, and 

(iii) expressing the sequence in the vicinity of the embryo sac. 

The kinase protein being expressed by the DNA according to the invention is preferably a
leucine rich repeat receptor like kinase and comprises a ligand binding domain, a proline box, a
transmembrane domain, a kinase domain and a protein binding domain. In a specific
embodiment of the invention, the said kinase protein may lack a functional ligand binding
domain but comprises a proline box, a transmembrane domain, a kinase domain and a protein
binding domain.

The genetic properties engineered into the transgenic seeds and plants described above 
are passed on by sexual reproduction or vegetative growth and can thus be maintained and 
propagated in progeny plants. Generally said maintenance and propagation make use of 
known agricultural methods developed to fit specific purposes such as tilling, sowing or 
harvesting. Specialized processes such as hydroponics or greenhouse technologies can 
also be applied. As the growing crop is vulnerable to attack and damage caused by insects
or infections as well as to competition by weed plants, measures are undertaken to control
weeds, plant diseases, insects, nematodes, and other adverse conditions to improve yield.
These include mechanical measures such as tillage of the soil or removal of weeds and
infected plants, as well as the application of agrochemicals such as herbicides, fungicides, 
gametocides, nematicides, growth regulants, ripening agents and insecticides. 

Use of the advantageous genetic properties of the transgenic plants and seeds according to 
the invention can further be made in plant breeding which aims at the development of 
plants with improved properties such as tolerance of pests, herbicides, or stress, improved 
nutritional value, increased yield, or improved structure causing less loss from lodging or 
shattering. The various breeding steps are characterized by well-defined human 
intervention such as selecting the lines to be crossed, directing pollination of the parental 
lines, or selecting appropriate progeny plants. Depending on the desired properties different 
breeding measures are taken. The relevant techniques are well known in the art and include 
but are not limited to hybridization, inbreeding, backcross breeding, multiline breeding,
variety blend, interspecific hybridization, aneuploid techniques, etc. Hybridization
techniques also include the sterilization of plants to yield male or female sterile plants by 
mechanical, chemical or biochemical means. Cross pollination of a male sterile plant with 
pollen of a different line assures that the genome of the male sterile but female fertile plant 
will uniformly obtain properties of both parental lines. Thus, the transgenic seeds and plants 
according to the invention can be used for the breeding of improved plant lines which for 
example increase the effectiveness of conventional methods such as herbicide or pesticide
treatment or make it possible to dispense with said methods due to their modified genetic properties.
Alternatively new crops with improved stress tolerance can be obtained which, due to their 
optimized genetic "equipment", yield harvested product of better quality than products which 
were not able to tolerate comparable adverse developmental conditions. 

In seed production, germination quality and uniformity of seeds are essential product
characteristics, whereas germination quality and uniformity of seeds harvested and sold by
the farmer are not important. As it is difficult to keep a crop free from other crop and weed
seeds, to control seedborne diseases, and to produce seed with good germination, fairly 
extensive and well-defined seed production practices have been developed by seed 
producers, who are experienced in the art of growing, conditioning and marketing of pure 
seed. Thus, it is common practice for the farmer to buy certified seed meeting specific 
quality standards instead of using seed harvested from his own crop. Propagation material 
to be used as seeds is customarily treated with a protectant coating comprising herbicides, 
insecticides, fungicides, bactericides, nematicides, molluscicides or mixtures thereof. 
Customarily used protectant coatings comprise compounds such as captan, carboxin, 
thiram (TMTD®), metalaxyl (Apron®), and pirimiphos-methyl (Actellic®). If desired these
compounds are formulated together with further carriers, surfactants or application- 
promoting adjuvants customarily employed in the art of formulation to provide protection 
against damage caused by bacterial, fungal or animal pests. The protectant coatings may 
be applied by impregnating propagation material with a liquid formulation or by coating with 
a combined wet or dry formulation. Other methods of application are also possible such as 
treatment directed at the buds or the fruit. 

It is thus a further object of the present invention to provide plant propagation material for 
cultivated plants, but especially plant seed that is treated with a seed protectant coating
customarily used in seed treatment.

It is a further aspect of the present invention to provide new agricultural methods such as
the methods exemplified above which are characterized by the use of transgenic plants,
transgenic plant material, or transgenic seed according to the present invention. 

To breed progeny from plants transformed according to the method of the present 
invention, a method such as that which follows may be used: plants produced as described 
in the examples set forth below are grown in pots in a greenhouse or in soil, as is known in 
the art, and permitted to flower. Pollen is obtained from the mature stamens and used to 
pollinate the pistils of the same plant, sibling plants, or any desirable plant. Similarly, the 
pistils developing on the transformed plant may be pollinated by pollen obtained from the 
same plant, sibling plants, or any desirable plant. Transformed progeny obtained by this 
method may be distinguished from non-transformed progeny by the presence of the 
introduced gene(s) and/or accompanying DNA (genotype), or the phenotype conferred. 
The transformed progeny may similarly be selfed or crossed to other plants, as is normally 
done with any plant carrying a desirable trait. Similarly, tobacco or other transformed plants 
produced by this method may be selfed or crossed as is known in the art in order to 
produce progeny with desired characteristics. Similarly, other transgenic organisms 
produced by a combination of the methods known in the art and this invention may be bred 
as is known in the art in order to produce progeny with desired characteristics. 

Further comprised by the invention is a method of obtaining embryogenic cells in plant material, 
comprising transforming the material with a recombinant DNA sequence or a vector according to 
the invention, expressing the sequence in the material or derivatives thereof and subjecting the 
said material or derivatives to a compound which acts as a ligand for the gene product of the 
said sequence. 

The invention further relates to a method of generating somatic embryos under in vitro 
conditions wherein the SERK protein is overexpressed ectopically. 

The invention still further includes the use of the said DNA in the manufacture of apomictic 
seeds, in which use the sequence is expressed in the vicinity of the embryo sac. 

In a specific embodiment of the invention the SERK gene may be expressed in transgenic plants 
such as, for example, an Arabidopsis plant, under the control of plant expression signals, 
particularly a promoter which regulates expression of SERK genes in planta, but preferably a
developmentally regulated or inducible promoter such as, for example, the carrot chitinase
DcEP3-1 gene promoter, the Arabidopsis AtChitIV gene promoter, the Arabidopsis LTP-1 gene
promoter, the Arabidopsis bel-1 gene promoter, the petunia fbp-7 gene promoter, the
Arabidopsis ANT gene promoter, the promoter of the O126 gene from Phalaenopsis, the
Arabidopsis AtDMC1 promoter, or the pTA7001 inducible promoter.

The promoters of the DcEP3-1 and the AtChitIV genes may be cloned and characterized by
standard procedures. The DcSERK coding sequence (SEQ ID No. 2) is cloned behind the
DcEP3-1, the AtChitIV or the AtLTP-1 promoters and transformed into Arabidopsis. The ligation
is performed in such a way that the promoter is operably linked to the sequence to be
transcribed. This construct, which also contains known marker genes providing for selection of
transformed material, is inserted into the T-DNA region of a binary vector such as pBIN19 and
transformed into Arabidopsis. Agrobacterium-mediated transformation into Arabidopsis is
performed by the vacuum infiltration or root transformation procedures known to the skilled man.
Transformed seeds are selected and harvested and (where possible) transformed lines are
established by normal selfing. Parallel transformations with 35S promoter-SERK constructs and
the entire SERK gene itself are used as controls to evaluate over-expression in many cells or 
only in the few cells that naturally express the SERK gene. The 35S promoter-SERK construct 
may give embryo formation wherever the signal that activates the SERK-mediated transduction 
chain is present in the plant. A testing system based on emasculation and the generation of 
donor plant lines for pollen carrying LTP1 promoter-GUS and SERK promoter-barnase is
established. 

The same constructs (35S, EP3-1, AtChitIV, AtLTP-1 and SERK promoters fused to the SERK 
coding sequence) are employed for transformation into several Arabidopsis backgrounds. These 
backgrounds are wild type, male sterile, fis (allelic to emb 173) and primordia timing (pt)-1 lines, 
or a combination of two or several of these backgrounds. The wt lines are used as a control to 
evaluate possible effects on normal zygotic embryogenesis, and to score for seed set without 
fertilization after emasculation. The ms lines are used to score directly for seed set without 
fertilization. The fis lines exhibit a certain degree of seed and embryo development without 
fertilization, so may be expected to have a natural tendency for apomictic embryogenesis, which 
may be enhanced by the presence of the SERK constructs. The pt-1 line has superior 
regenerative capabilities and has been used to initiate the first stably embryogenic Arabidopsis 
cell suspension cultures. Combinations of several of the above backgrounds are obtained by
crossing with each other and with lines containing ectopic SERK expressing constructs. Except
for the ms lines, propagation can proceed by normal selfing, and analysis of apomictic traits
following emasculation. A similar strategy is followed in which the AtChitIV, AtLTP-1 and SERK
promoters are replaced by the bel-1 and fbp-7 promoters as well as by other promoters specific for
components of the female gametophyte.
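
The transformation scheme described above pairs five promoter-SERK fusions with four Arabidopsis backgrounds. The following Python sketch merely enumerates that design matrix as an illustration; the dictionary keys and the one-line purposes are shorthand paraphrased from the text and are assumptions of this sketch, not identifiers used in the disclosure. Combined backgrounds obtained by crossing are not enumerated.

```python
from itertools import product

# Promoters fused to the SERK coding sequence, as listed in the text.
PROMOTERS = ["35S", "EP3-1", "AtChitIV", "AtLTP-1", "SERK"]

# Arabidopsis backgrounds and the purpose each serves (paraphrased shorthand).
BACKGROUNDS = {
    "wt": "control; seed set without fertilization scored after emasculation",
    "ms": "seed set without fertilization scored directly",
    "fis": "some seed and embryo development without fertilization",
    "pt-1": "superior regenerative capabilities",
}


def design_matrix():
    """Return every (promoter, background) pairing to be transformed."""
    return list(product(PROMOTERS, BACKGROUNDS))


for promoter, background in design_matrix():
    print(f"{promoter} promoter-SERK in {background}: {BACKGROUNDS[background]}")
```

Five promoters and four single backgrounds give twenty pairings.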

Additional constructs are generated that have constitutive receptor kinase activity. Most of the 
receptor kinases of the SERK type act as homodimeric receptors, requiring autophosphorylation 
before being able to activate downstream signal transduction cascades. In many receptor 
kinases the extracellular domain serves as an inhibitor of the kinase domain in the ligand-free 
stage. This arrest is removed after binding of the ligand (Cadena and Gill, 1992). By introduction 
of a SERK construct, from which the extracellular ligand-binding domain has been removed,
mutant homodimeric (in cells that do not have a natural population of SERK proteins) or 
heterodimeric (in cells that also express the unmodified forms) proteins can be generated with a 
constitutively activated kinase domain. This approach, when coupled to one of the promoters 
active in the nucellar region, results in activation of the embryogenic pathway in the absence of
the activating signal. This may be an important alternative in cases where it is necessary or
desirable to have activation of the SERK pathway dependent only on specific promoter activity
and independent of temporal regulation of an activating signal. SERK constructs
that result in fertilization-independent embryogenesis (fie) are introduced into other species and tested for their
effect. In order to recognize the fie phenotype, the skilled man will use appropriate male sterile
backgrounds. However, pollination is often necessary for apomixis of the adventitious embryony 
type, in order to ensure the production of endosperm. 

Whilst the present invention has been particularly described by way of the production of 
apomictic seed by heterologous expression of the SERK gene in the nucellar region of the
carpel, the skilled man will recognize that other genes, the products of which have a similar 
structure/function to the SERK gene product, may likewise be expressed with similar results. 
Moreover, although the example illustrates apomictic seed production in Arabidopsis, the 
invention is, of course, not limited to the expression of apomictic seed-inducing genes solely in 
this plant. Moreover, the present disclosure also includes the possibility of expressing the SERK 
(or related) gene sequences in the transformed plant material in a constitutive, non-tissue-specific
manner (for example under transcriptional control of a CaMV35S or NOS promoter). In
this case, tissue specificity is assured by the localized presence within the vicinity of the embryo
sac of the ligand of the product of the said gene. Furthermore, the SERK (or related) gene
products may interact with proteins such as transcription factors which are involved in regulating 
embryogenesis. This interaction within tissue which has been transformed according to the 
present disclosure is also part of the present invention. 

The skilled man who has the benefit of the present disclosure will also recognize that the SERK 
gene (and others as indicated in the preceding paragraph) may be transformed into plant 
material which may be propagated and/or differentiated and used as an explant from which
somatic embryos can be obtained. Expression of such sequences in the transformed tissue
(which is subjected to a ligand of the kinase gene products) substantially increases the
percentage of the cells in the tissue which are competent to form somatic embryos, in 
comparison with the number present in non-transformed like tissue. 

The invention will be further apparent from the following description and the associated drawings 
and sequence listings. 

SEQ ID NO: 1 depicts the Daucus carota genomic clone of the putative receptor kinase (SERK)
associated with the transition of competent to embryogenic cells.
SEQ ID NO: 2 depicts the cDNA of the said putative kinase.
SEQ ID NO: 3 depicts the predicted protein sequence of the SERK protein encoded by
the DNA of SEQ ID NO: 1.
SEQ ID NOs: 4-16 depict the sequences of various PCR primers.
SEQ ID NOs: 17-19 depict specific peptides contained within the gene product of SEQ ID NO: 2.
SEQ ID NO: 20 depicts the Arabidopsis thaliana partial genomic clone of the putative receptor
kinase (SERK) associated with the transition of competent to embryogenic cells.
SEQ ID NO: 21 depicts the predicted protein sequence of the SERK protein encoded by the
DNA of SEQ ID NO: 20.
SEQ ID NOs: 22, 24, 26, 28 and 30 depict the partial DNA sequences of 5 EST clones with
high homology to the SERK LRR sequences.
SEQ ID NOs: 23, 25, 27, 29 and 31 depict the predicted protein sequences of the partial DNA
sequences of the 5 EST clones of SEQ ID NOs: 22, 24, 26, 28 and 30.
SEQ ID NO: 32 depicts the nucleotide sequence of the SERK cDNA from Arabidopsis thaliana.
SEQ ID NO: 33 depicts the predicted amino acid sequence of the SERK protein from
Arabidopsis thaliana encoded by the DNA of SEQ ID NO: 32.

Figure 1 shows the results of an RT-PCR experiment performed on RNA extracted from the
indicated tissues. 40 cycles followed by Southern blotting of the resulting bands are necessary to
visualize SERK expression. Lanes include explants at day 7, treated for less (lane 1) or more
(lane 2) than 3 days with 2,4-D. In the original a very faint signal is visible in lane 2, but not in
lane 1. Established embryogenic cultures (lanes 4-6) but not a non-embryogenic control (lane 3) 
express the SERK gene. In carrot plants, no expression is detectable except for developing 
seeds after pollination (lane 7). Up to day 7 after pollination, the carrot zygote remains undivided, 
suggesting that the observed signal is coming only from the zygote. At day 10, the early globular 
and at day 20 the heart stage is reached in carrot zygotic embryogenesis. No signals are seen 
on Northern blots. 

Figure 2A shows the results of a whole-mount in situ hybridization with the SERK cDNA on 7 
day explants treated for 3 days with 2,4-D. Few cells on the surface of the explant express the
SERK gene, and those cells that do are the ones that become embryogenic. Figure 2B shows a 
whole mount in situ hybridization on a partially dissected seed containing a globular zygotic 
embryo. Hybridization is visualized by DIG staining. 

Figure 3 shows SERK expression in embryogenic hypocotyl cells during hormone-induced
activation, determined by whole mount in situ hybridization. Bar: 50 µm.

(A-E) Cell population generated by mechanical fragmentation of the activated hypocotyls. Only
a few cells of a certain type, defined as enlarged cells, show SERK expression (asterisks). Small
cytoplasmic cells (c), enlarging cells (eg) and large cells (l) never show SERK expression.

(F) Hypocotyl longitudinal section before hormone-induced activation. It is not possible to detect
any SERK expression in any type of cell.

(G-I) Proliferating mass coming from the inner hypocotyl tissues 10 days after the beginning of
the hormonal treatment (longitudinal section). In G a single enlarged cell showing SERK
expression is detectable within a row of negative cells showing the same morphology. In H a
single enlarged cell showing SERK expression is detaching from the surface of the proliferating
mass. In I a cluster of enlarged cells showing SERK expression is detectable at the surface of
proliferating tissue.

(J) Proliferating mass coming from the inner tissues of the hypocotyl 10 days after the beginning
of the rooting treatment (24 hours with 2,4-D followed by hormone removal). Neither the root
primordia nor the enlarged cells detaching from the surface show any SERK expression.

Figure 4 shows the phenotype of Arabidopsis WS plants transformed with the 2200 bp SERK-luciferase
construct at the seedling level. Pictures were taken at 28 days after germination of T2
seeds. In plants II and III no clear shoot meristem is visible at the seedling stage, 7 days after
germination. The first two leaves, if they develop at all, are needle-shaped, as shown in the pictures
taken 28 days after germination. At this time plant I, which shows no clear phenotype, already
starts flowering. Secondary shoot meristems are already developing in plant II and will also
develop later from plant III. Shoot meristems, inflorescences and normal flowers eventually
develop on all plants.

Figure 5 shows how the 2200 bp SERK-luciferase construct affects the number of developing
ovules in the siliques of transformed plants. 

Figure 6 shows autophosphorylation of purified SERK fusion protein in vitro. Lane 1: purified
SERK fusion protein; Lane 2: serine phosphate; Lane 3: threonine phosphate; Lane 4: tyrosine
phosphate.

The following description illustrates the isolation and cloning of the SERK gene and the 
production of apomictic seed by heterologous expression of the said gene in the nucellar region
of the carpel so that somatic embryos form which penetrate the embryo sac and are
encapsulated by the seed as it develops.

ISOLATION AND CLONING OF THE SERK GENE FROM DAUCUS CAROTA

Isolation of cDNA clones that are preferentially expressed in embryogenic cell cultures of 
carrot 

In order to increase the chance of success in obtaining genes expressed in carrot suspension
cells competent to form embryos, the number of embryo-forming cells present in a series of
established cell cultures was determined. A sub-population of cells that passed through a 30 µm
nylon sieve was isolated from eight different cultures that ranged in age between 2 months and
4 years. In these sub-30 µm populations, the number of embryos formed from the single cells
and small cell clusters was determined and expressed as a percentage of the total number of
cells present at the start of embryogenesis. Sieved <30 µm cultures able to form somatic
embryos with a frequency of more than 1% were then used as a source for competent cells, and
cultures that produced less than 0.01% embryos were used as non-embryogenic controls. As
main cloning strategies, cold plaque screening (Hodge et al., 1992) and differential display (dd)
RT-PCR (Liang and Pardee, 1992) were used besides conventional differential screening of
cDNA libraries.
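
The selection rule described above (embryo frequency expressed as a percentage of the cells present at the start, with cut-offs at 1% and 0.01%) can be sketched in Python as follows. This is an illustrative sketch only: the function names and the example counts are hypothetical, and the label for cultures falling between the two cut-offs is an assumption, since the text does not state how such cultures were handled.

```python
def embryo_frequency(embryos_formed, cells_at_start):
    """Embryos formed as a percentage of the cells present at the start."""
    if cells_at_start <= 0:
        raise ValueError("cells_at_start must be positive")
    return 100.0 * embryos_formed / cells_at_start


def classify_culture(embryos_formed, cells_at_start):
    """Apply the two cut-offs given in the text (>1% and <0.01%)."""
    freq = embryo_frequency(embryos_formed, cells_at_start)
    if freq > 1.0:
        return "source of competent cells"
    if freq < 0.01:
        return "non-embryogenic control"
    # Assumption: the text gives no use for cultures between the cut-offs.
    return "intermediate (no use stated)"


# Hypothetical counts, chosen only to exercise each branch.
print(classify_culture(150, 10_000))  # 1.5 % of cells formed embryos
print(classify_culture(0, 10_000))    # no embryos formed
print(classify_culture(5, 10_000))    # 0.05 % of cells formed embryos
```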

Labeled probes for differential screening were obtained from RNA from a <30 µm sieved
sub-population of cells from either embryogenic or non-embryogenic cell cultures. Employing 
these probes in a library screen of approximately 2000 plaques yielded 26 plaques that failed to 
show any hybridization to either probe. These so-called cold plaques were purified and used for 
further analysis. From the total number of plaques that did hybridize, about 30 did so only with 
the probe from embryogenic cells. ddRT-PCR reactions using a combination of one anchor 
primer and one decamer primer were performed on mRNA isolated from three embryogenic, and 
three non-embryogenic suspension cultures. About 50 different ddRT-PCR fragments were 
obtained from each reaction. Using combinations of three different anchor and six different 
decamer primers, a total of approximately 1000 different cDNA fragments was visualized. Six of 
these PCR fragments were only found in lanes made with mRNA from <30 µm populations of
cells from embryogenic cultures (Table 1) and with oligo combinations of the anchor primer
(5'-TTTTTTTTTTTGC-3') and the decamer primers (5'-GGGATCTAAG-3'), (5'-ACACGTGGTC-3'),
(5'-TCAGCACAGG-3'). Because differential PCR fragments often consist of several unresolved
cDNA fragments (Li et al. 1994), cloning proved to be essential prior to undertaking further 
characterization of the PCR fragments obtained. 
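
As a rough consistency check on the figures quoted above, the number of primer pairings and the expected number of displayed fragments can be tallied as in the Python sketch below. The primer labels are placeholders (the text gives the counts of anchor and decamer primers but not a complete list of sequences), and the per-reaction figure of 50 is the approximate value stated in the text.

```python
from itertools import product

# Placeholder labels: three anchor primers and six decamer primers were used,
# but their full sequences are not all listed, so these names are hypothetical.
ANCHORS = [f"anchor_{i}" for i in range(1, 4)]
DECAMERS = [f"decamer_{i}" for i in range(1, 7)]
FRAGMENTS_PER_REACTION = 50  # "about 50" fragments per reaction


def primer_pairs():
    """Every anchor/decamer pairing that could be used in one ddRT-PCR reaction."""
    return list(product(ANCHORS, DECAMERS))


def expected_fragments():
    """Approximate number of cDNA fragments displayed over all pairings."""
    return len(primer_pairs()) * FRAGMENTS_PER_REACTION


print(len(primer_pairs()), "primer pairings")
print(expected_fragments(), "fragments expected")
```

Eighteen pairings at about 50 fragments each give roughly 900 fragments, in line with the "approximately 1000" reported.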

All clones obtained were subjected to a second screen, which consisted of spot-dot Northern
hybridization performed under conditions of high stringency. This method, which used RNA from
entire unsieved embryogenic and non-embryogenic suspension cultures, proved to be a fast and 
reliable additional selection method. Only one clone (22-28) of the 30 clones obtained after 
differential screening, proved to be restricted to embryogenic cell cultures while the majority was 
constitutively expressed. The 26 clones obtained from the cold plaque screening required long 
exposure times in the spot-dot Northern analysis. Six of these clones failed to show any 
hybridization signal and 19 proved to be expressed in both embryogenic and non-embryogenic 
cell cultures. One clone (31-50) showed low expression in all embryogenic cultures, and in one 
non-embryogenic culture, but not in the others. Of the six cloned fragments obtained by ddRT- 
PCR display, four showed hybridization more or less restricted to transcripts present in 
embryogenic cultures. All clones that passed through the second screening were sequenced. 
Two of the ddRT-PCR clones (6-8 and 7-13) were identical to the carrot Lipid Transfer Protein 
(LTP) gene, previously identified as a marker for embryogenic carrot cell cultures. LTP 
expression is restricted to embryogenic cell clusters and the protoderm of somatic and zygotic 
embryos from the early globular stage onwards (Sterk et al., 1991). Therefore, while the LTP
gene is not a marker for competent cells, its appearance in the screening confirms the validity of
our methods with respect to the cloning of genes expressed early during somatic
embryogenesis. 

cDNA clone 31-50 encodes a leucine-rich repeat containing receptor-like kinase 
The mRNA corresponding to the isolated clone 31-50 had an open reading frame of 1659 
nucleotides encoding a protein with a calculated Mw of 55 kDa. Because clone 31-50 is mainly 
expressed in embryogenic cell cultures it was renamed Somatic Embryogenesis Receptor 
Kinase (SERK). The SERK protein contains an N-terminal domain with a five-times repeated
leucine-rich motif that is proposed to act as a protein-binding region in LRR receptor kinases
(Kobe and Deisenhofer, 1994). Between the extracellular LRR domain of SERK and the
membrane-spanning region is a 33 amino acid region rich in prolines (13), that is unique for the
SERK protein. Of particular interest is the sequence SPPPP, that is conserved in extensins, a
class of universal plant cell wall proteins (Varner and Lin, 1989). The proposed intracellular
domain of the protein contains the 11 subdomains characteristic of the catalytic core of protein
kinases. The core sequences HRDVKAAN and GTLGYIAPE in kinase subdomains VIb and VIII,
respectively, suggest a function as a serine/threonine kinase (Hanks et al., 1988).
Another interesting feature of the intracellular part of the SERK protein is that the C-terminal 24
amino acids resemble a single LRR. The serine and threonine residues present within the
intracellular LRR sequence are surrounded by acidic residues and might be targets for the 
autophosphorylation of SERK, thereby regulating the ability of other proteins to interact with this 
receptor-kinase in a similar fashion as described for the SH2 domain of the EGF family of 
tyrosine receptor kinases. 

Hybridization of the SERK cDNA clone to the carrot genome revealed the presence of only a 
single main hybridizing band after digestion with EcoRI, probably reflecting a single SERK gene
in the carrot genome. This was confirmed after digestion with DdeI, an enzyme that cuts three
times within the SERK gene. No signal was observed after Northern blotting of mRNA from 
embryogenic cell cultures and hybridization with labeled SERK probes, reflecting the low levels 
of transcript present in these cultures. Detection of the SERK transcript on the original spot-dot 
Northerns was only possible after long exposure times compared with other probes. 

The ability of the SERK protein to autophosphorylate was investigated in vitro, using a previously 
described autophosphorylation assay (Mu et al. 1994), with a bacterial fusion protein that 
contained the complete intracellular region of the SERK protein. The bacterially expressed 
SERK fusion protein was able to autophosphorylate, indicating that the SERK protein is able to 
fulfill a role as a protein kinase in vivo (Heldin, 1995). 

Expression of the SERK gene corresponds with the first appearance of competent cells
during hypocotyl activation 

When carrot hypocotyls are induced with 2,4-D, only the cells of the provascular tissue 
proliferate. Cells of epidermal and cortical origin merely expand, suggesting that the provascular 
tissue derived cells form the newly initiated suspension culture. After removal of 2,4-D, the 
formation of somatic embryos occurs after 2-3 weeks. Somatic embryos are preceded by 
embryogenic cells, that are developed in turn from competent cells. While competent and 
embryogenic cell formation take place in the presence of 2,4-D, it was not clear when this 
occurred, and which cells acquired competence. Since previous experiments (Toonen et al. 
1994) revealed that cell morphology is not a good criterion, the first appearance of single 
competent cells was determined experimentally by semi-automatic cell tracking performed on 
large populations of immobilized cells. Hypocotyl explants activated with 2,4-D for seven days 
were mechanically fragmented and samples of the resulting population of mainly single
suspension cells were immobilized to allow recording of their development by cell tracking. In the
immobilized cell populations obtained in this way all the morphologically discernible cell types
were present that were also seen in the un-fragmented activated hypocotyls. Because the
different cell morphologies observed during hypocotyl activation were known (Guzzo et al.,
1995), it was possible to trace back the original position of each type of cell in the activated
explant. Small cytoplasm-rich cells (16x16 µm) are the proliferating cells that surround the
vascular elements. Enlarging vacuolated cells (16x40 µm) are encountered on the surface of
the mass of proliferating cells and these can detach from the surface when fully enlarged (35x90
µm). Large vacuolated cells (more than 60x140 µm) are the non-proliferating remnants of the
hypocotyl epidermis and cortical parenchyma. The shape of the enlarging and fully enlarged
cells could change from oval to elongate or triangular. Cell tracking on a total of 24,722 cells
released from hypocotyls activated for seven days showed that only 20 single cells formed a
somatic embryo. Because of their dependence on continued 2,4-D treatment, the embryo-forming
single cells are still in the competent cell stage. All of the embryo-forming single cells
belonged to the category of 3,511 enlarged cells, which therefore contained competent cells at a
frequency of 0.56%. The single cell tracking experiments clearly reveal that the ability of explant
cells to reinitiate cell division under the influence of 2,4-D, resulting in a population of highly 
cytoplasmic and rapidly proliferating cells, does have a causal relation with the ability to become 
embryogenic. It is also clear that only a very limited number of the cells that make up the newly 
initiated embryogenic suspension culture are actually competent to form embryogenic cells. 

Expression of the SERK gene, determined by whole mount in situ hybridization on a similar 
population of cells as used for the cell tracking experiments, was found to be restricted to only 
0.44% of the enlarged cells. Therefore, the expression of the SERK gene appears closely 
correlated both qualitatively and quantitatively with the presence of competent single cells. 
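
The quantitative comparison drawn above can be reproduced with a few lines of Python. This sketch assumes that the enlarged-cell category in the cell tracking experiment comprised 3,511 cells (the count is partly garbled in the available text); the variable names are illustrative only.

```python
TRACKED_CELLS = 24_722    # cells followed by cell tracking
ENLARGED_CELLS = 3_511    # assumed reading of the enlarged-cell count
EMBRYO_FORMING = 20       # single cells that formed a somatic embryo
SERK_POSITIVE_PCT = 0.44  # enlarged cells expressing SERK (in situ), in percent


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


competent_in_enlarged = percent(EMBRYO_FORMING, ENLARGED_CELLS)
competent_overall = percent(EMBRYO_FORMING, TRACKED_CELLS)

print(f"competent among enlarged cells: {competent_in_enlarged:.4f} %")
print(f"competent among all tracked cells: {competent_overall:.4f} %")
print(f"SERK-positive enlarged cells: {SERK_POSITIVE_PCT:.2f} %")
```

Under that assumption the competent-cell frequency among enlarged cells comes to about 0.57%, which matches the reported 0.56% if the value was truncated rather than rounded, and it is of the same order as the 0.44% of enlarged cells found to express SERK.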

To obtain insight into the temporal regulation of SERK expression in the course of explant 
activation, whole mount in situ hybridization was performed on entire intact or hand-sectioned 
explants treated for different periods with 2,4-D. Representative samples were collected at daily 
intervals from explants untreated and treated for three days, six days, seven days or ten days 
with 2,4-D before returning to B5-0. No SERK-expressing cells were ever found in explants 
treated for less than three days with 2,4-D. While enlarged cells became present after the first
five days of culture, the first few SERK-expressing enlarged cells were found after six to seven
days of culture in the presence of 2,4-D. These few cells were present at the surface
of the mass of proliferating cells originating from the provascular tissue. In the hypocotyls treated
for ten days with 2,4-D, the number of SERK-positive cells had increased to 3.04% and included
at this stage also cells present in small clusters. No SERK transcript was ever detected in small
cytoplasm-rich cells or large vacuolated cells. Hypocotyls were also treated for only one day with
2,4-D and subsequently cultured in hormone-free medium for a total of seven or ten days. Under 
these conditions explant cells proliferated and gave rise to roots and non-embryogenic cell 
cultures, while SERK expression could never be detected. The in situ hybridization results 
described above were obtained from a relatively small number of explants and a few hundred 
cells, so RT-PCR followed by Southern hybridization was performed to obtain more quantitative 
results. These are shown in Figure 7 and confirm the close temporal correlation between the first 
appearance of competent cells in explants treated for three days with 2,4-D and the expression 
of the SERK gene. Northern hybridization never gave any signal after hybridization with SERK 
cDNA probes, not even after prolonged exposure in a PhosphorImager, in line with the 
extremely restricted expression pattern of the SERK gene. 

Expression of the SERK gene corresponds with the occurrence of competent cells in 
established embryogenic cell cultures 

While the results described so far indicate that competent and embryogenic cell formation is 
restricted to a particular class of enlarged cells during explant activation, the situation in an 
established embryogenic cell culture is more complex. Competent single cells in such cultures 
do not appear to belong to one cell type in particular, but have been shown to originate from all 
morphologically different cell types. In cell tracking experiments, embryogenic cells, that do not 
require exogenous auxin treatment, were never observed to be single but consisted of clusters 
of at least 3-4 cells (Toonen et al. 1994). SERK expression was found in all morphologically 
discernible single cell types that were present in an embryogenic cell culture at a frequency 
between 0.1 and 0.5% depending on the cell type. In non-embryogenic cultures, SERK 
expressing cells were never encountered. As was observed in the activated explants, SERK 
expression was not restricted to single cells, but also occurred in small clusters of 2 to 16 cells. 
Since clusters of this size are known to consist of embryogenic cells, these data show that 
SERK expression is not restricted to competent single cells, but may persist in small clusters of 
embryogenic cells. No SERK expression was encountered during the late globular, heart and 
torpedo-stages of somatic embryogenesis. 

The SERK gene is transiently expressed in zygotic embryogenesis 

The expression of the SERK gene in carrot plants was determined by RT-PCR. The results 
indicate that no SERK mRNA accumulates in any of the adult plant organs nor in flowers prior to 
pollination. The first occasion when SERK expression can be detected is in flowers at three days 
after pollination (DAP), at which stage fertilization has taken place and endosperm development 
has commenced. SERK mRNA remains present in flowers up to twenty DAP, corresponding 
with the early globular stage of the zygotic embryo (Yeung et al. 1996). Whole mount in situ 
hybridization on partially dissected carrot seeds confirmed that the SERK gene was only 
expressed in early embryos up to the globular stage. Expression was observed in the entire 
embryo including the suspensor. No expression was seen in seedlings, roots, stems, leaves, 
developing and mature flower organs, pollen grains and stigmas before and after fertilization. 
Tissues in the developing seed such as seed coat, integuments, all embryo sac constituents 
before fertilization as well as the endosperm at all stages of development investigated did not 
show any SERK expression. Later stages of carrot zygotic embryos were also completely devoid 
of SERK mRNA. Given this pattern of expression, which is restricted to the zygotic embryo, the 
signal as detected by RT-PCR in flowers at 3 and 7 DAP must come from SERK mRNA as 
present in zygotes, because in carrot the zygote remains undivided up to one week after 
pollination (Yeung et al. 1996). Although SERK expression persists to slightly later stages in 
zygotic globular embryos when compared to the somatic ones, these results confirm the 
transient pattern of expression as observed for the SERK gene during somatic embryogenesis 
and also imply that there is a correspondence between the formation of competent cells in vitro 
and the formation of the zygote in vivo. 



METHODS 

Cell culture, hypocotyl explant induction and cell tracking 

Cell cultures were derived from Daucus carota cv. Flakkese and maintained as previously 
described (De Vries et al. 1988a). Cell suspension cultures were maintained at high cell density 
in B5 medium (Gamborg et al. 1968) supplemented with 2 µM 2,4-D (B5-2 medium). Embryo 
cultures with globular, heart and torpedo-stage somatic embryos were derived from <30 µm 
sieved cell cultures cultured at low cell density (100 000 cells/ml) in B5 medium without 2,4-D 
(B5-0). For hypocotyl explant induction experiments, plantlets were obtained from seed of 
Daucus carota cv. S. Valery as described previously (Guzzo et al. 1994). The hypocotyls of 
one-week-old plantlets were divided into segments of 3-5 mm, incubated for various periods of 
time in B5-2 medium and returned to B5-0 medium. Seven days after explantation and exposure 
to 2,4-D, the hypocotyl segments were fragmented on a 170 µm sieve and the resulting cells 
collected to form a fine cell suspension. Immobilization of these cells in B5-0.2 medium was 
performed in a thin layer of phytagel (Toonen et al. 1994). After one week of further culture 
2,4-D was removed by washing the plates with B5-0 medium. This allowed embryos to develop 
beyond the globular stage. The development of the immobilized cells was recorded with a 
procedure modified from that previously described by Toonen et al. (1994). The main change 
involved a new MicroScan program for automatic 3-axis movement to scan all cells in the 
phytagel (Toonen et al. 1996). 

Nucleic acid isolation and analysis 

RNA was isolated from cultured cells and plant tissues as described by De Vries et al. (1988b). 
Poly(A)+ RNA was obtained by purification on oligo(dT) cellulose (Biolabs). For RNA gel blot 
analysis, samples of 10 µg total RNA were electrophoresed on formamide gel and transferred to 
nytran-plus membranes. For RNA spot-blot analysis, 5 µg of total RNA was denatured and 
spotted onto nytran-plus filters using a hybridot manifold (BRL). 

Genomic DNA was isolated according to Sterk et al. (1991). Samples of 10 µg genomic DNA 
were digested with different restriction enzymes, separated on agarose gel and transferred 
to nytran-plus membrane (Schleicher & Schuell). Hybridization of RNA blots took place at 42°C 
in hybridization buffer containing 50% formamide, 6xSSC, 5xDenhardt, 0.5% SDS and 0.1 
mg/ml salmon sperm DNA. Hybridization of DNA blots was performed as previously described 
(Sterk et al. 1991). Following hybridization, filters were washed under stringent conditions (3x20 
min in 0.1x SSC, 1% SDS, at 65°C). Filters were exposed to Kodak XOmat AR film. The 
integrity and the amount of RNA on the blots were confirmed by hybridization with an 18S 
ribosomal RNA probe. Nucleotide sequence analysis was performed on an ABI 373A automated 
DNA sequencer (Applied Biosystems). 

Screening procedures 

Two independent cDNA libraries were constructed with equal amounts of poly(A)+ RNA from 
total established cell cultures grown for six days in B5-2 medium, sieved <125 µm cell cultures 
grown for six days in B5-0 medium and sieved <30 µm cell cultures grown for six days in B5-0 
medium. cDNA synthesis and cloning into the Uni-ZAP XR vector was performed according to 
the manufacturer's protocol (Stratagene). 

Differential screening of the cDNA libraries was performed essentially as described by Scott et 
al. (1991). RNA was isolated from either three embryogenic or three non-embryogenic cell 
cultures that were grown for seven days in B5-2 after sieving through 30 µm mesh. First strand 
cDNA synthesis was performed on 4 µg total RNA using AMV reverse transcriptase (Gibco 
BRL). [32P]dATP-labeled probes were prepared using random prime labeling on first strand 
cDNA. Pooled probes from embryogenic and non-embryogenic cell populations were hybridized 
to two pairs of nitrocellulose filters, each containing 1000 plaques from one cDNA library. After 
washing for 3x20 min in 0.1x SSC, 1% SDS at 65°C, hybridization was visualized by 
autoradiography for two days on Kodak X-omatic film. Plaques that only showed signal with the 
embryogenic transcript probe were purified by two further rounds of screening. 

In order to identify cDNA clones which are expressed at low levels in the <30 µm sieved cell 
population, cold plaque screening was performed as described by Hodge et al. (1992). Plaques 
from the differential screening that did not show any signal after seven days of autoradiography 
were purified by two further rounds of screening. The resulting clones were used as probes for 
characterization of the expression pattern of the corresponding genes. 

Differential Display RT-PCR 

Differential display of mRNA was performed essentially as described by Liang and Pardee 
(1992). cDNA synthesis took place by annealing 1 µg of total RNA in 10 µl buffer containing 
200 mM KCl, 10 mM Tris-HCl (pH 8.3), and 1 mM EDTA with 100 ng of one of the following 
anchor primers: (5'-TTTTTTTTTTTGC-3'), (5'-TTTTTTTTTTTCTG-3'), (5'-TTTTTTTTTTTCA-3'). 
Annealing took place by heating the mix for 3 min at 83°C followed by incubation for 30 min at 
42°C. Annealing was followed by the addition of 15 µl pre-warmed cDNA buffer containing 16 
mM MgCl2, 24 mM Tris-HCl (pH 8.3), 8 mM DTT, 400 µM dNTP, and 4 units AMV reverse 
transcriptase (Gibco BRL). cDNA synthesis took place at 42°C for 90 min. First strand cDNA was 
phenol/chloroform extracted and precipitated with ethanol using glycogen as a carrier. The 
PCR reaction was performed in a reaction volume of 20 µl containing 10% of the synthesized 
cDNA, 100 ng of anchor primer, 20 ng of one of the following 10-mer primers: 
(5'-GGGATCTAAG-3'), (5'-TCAGCACAGG-3'), (5'-GACATCGTCC-3'), (5'-CCCTACTGGT-3'), 
(5'-ACACGTGGTC-3'), (5'-GGTGACTGTC-3'), 2 µM dNTP, 0.5 unit Taq enzyme in PCR buffer (10 
mM Tris-HCl (pH 9.0), 1.5 mM MgCl2, 50 mM KCl, 0.01% gelatin and 0.1% Triton X-100) and 6 
nM [α-32P]dATP (Amersham). PCR parameters were 94°C for 30 sec, 40°C for 1 min, and 72°C 
for 30 sec for 40 cycles using a Cetus 9600 (Perkin-Elmer). Amplified and labeled cDNAs were 
separated on a 6% denaturing DNA sequencing gel. Gels were dried without fixation and bands 
were visualized by 16 hours of autoradiography using Kodak X-omatic film. Bands containing 
differentially expressed cDNA fragments of 150-450 nucleotides were cut out of the gel and 
DNA was extracted from the gel slices by electroelution onto DE-81 paper (Whatman). After 
washing of the paper in low salt buffer (100 mM LiCl2 in 10 mM TE buffer) and elution of the 
cDNA in high salt buffer (1 M LiCl2 in 10 mM TE buffer with 20% ethanol), the cDNA was 
concentrated by precipitation in ethanol using glycogen as carrier. Reamplification of the cDNA 
fragments used the same PCR cycling parameters as described above, but with PCR buffer 
containing 2.5 µM of both the 10-mer and the anchor oligo and 100 µM dNTP. DE-81 paper 
allowed an efficient recovery of the DNA fragments and reamplification generated an average of 
500 ng DNA after 40 cycles. Amplified PCR products were blunt-ended using the Klenow 
fragment of E. coli DNA Polymerase I (Pharmacia), purified on Sephacryl S-200 columns 
(Pharmacia), ligated into a SmaI-linearized pBluescript II SK vector (Stratagene) and 
transformed into E. coli by electroporation. 
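
The primer scheme described above pairs each anchor primer with each 10-mer primer. The 
following Python sketch enumerates these pairs, one display reaction per pair and RNA sample. 
It is an illustration only: the anchor primers are written as eleven T residues followed by the 
anchoring bases, which is an assumed reading of the primer list, and the function names are 
hypothetical. 

```python
from itertools import product

# Anchor primers: assumed to be an oligo(dT) stretch of 11 T residues
# followed by the anchoring bases (assumption, see lead-in).
ANCHOR_PRIMERS = ["T" * 11 + "GC", "T" * 11 + "CTG", "T" * 11 + "CA"]

# Arbitrary 10-mer primers as listed in the text.
ARBITRARY_PRIMERS = [
    "GGGATCTAAG", "TCAGCACAGG", "GACATCGTCC",
    "CCCTACTGGT", "ACACGTGGTC", "GGTGACTGTC",
]


def primer_combinations(anchors, arbitrary):
    """Return every (anchor, 10-mer) pair; each pair is one display reaction."""
    return list(product(anchors, arbitrary))


def gc_fraction(seq):
    """Fraction of G and C residues in a primer sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


combos = primer_combinations(ANCHOR_PRIMERS, ARBITRARY_PRIMERS)
print(len(combos), "primer combinations per RNA sample")
for anchor, tenmer in combos[:3]:
    print(anchor, tenmer, f"GC fraction of 10-mer: {gc_fraction(tenmer):.2f}")
```

Listing the combinations in this way makes it straightforward to track which primer pair gave 
rise to a given differentially displayed band. 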

RT-PCR 

Adult plant tissues from Daucus carota were obtained from S&G Seeds (Enkhuizen). Controlled 
pollination was performed by hand. Flower tissue RNA was obtained from three complete umbels 
for each time-point and contained all flower organs including pollen grains. 2 µg of total RNA 
from adult plant tissue or cell cultures was annealed at 42°C with 50 ng oligo 
(5'-TCTTGGACCAGATAATTC-3') in 10 µl annealing buffer (250 mM KCl, 10 mM Tris-HCl pH 8.3, 
1 mM EDTA). After 30 min annealing, 1 unit AMV reverse transcriptase was added in a volume 
of 15 µl cDNA buffer (24 mM Tris-HCl pH 8.3, 16 mM MgCl2, 8 mM DTT, 0.4 mM dNTP). The 
reverse transcription reaction took place for 90 min at 42°C. PCR amplification of SERK cDNA 
was carried out with two specific oligos for the SERK kinase domain, 
(5'-CTCTGATGACTTTCCAGTC-3') and (5'-AATGGCATTTGCATGG-3'). Amplification was carried 
out with 30 cycles of 30 sec at 94°C, annealing at 54°C for 30 sec and extension at 72°C for 1 
min, followed by a final extension for 10 min at 72°C. 
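
The reverse transcription above is assembled by adding 15 volume units of cDNA buffer to 10 
volume units of annealing mix, so the working concentrations differ from those of the two 
stock buffers. The following Python sketch works through that arithmetic and also sums the 
programmed hold times of the PCR. It assumes simple additive volumes and ignores the volume 
and buffer contribution of the RNA, primer and enzyme; the function names are illustrative. 

```python
def final_concentrations(parts):
    """Mix solutions and return the final concentration of each component.

    parts: list of (volume, {component: concentration_mM}) tuples.
    All volumes must be given in the same unit.
    """
    total_volume = sum(volume for volume, _ in parts)
    amounts = {}
    for volume, components in parts:
        for name, conc in components.items():
            amounts[name] = amounts.get(name, 0.0) + volume * conc
    return {name: amount / total_volume for name, amount in amounts.items()}


def pcr_hold_time_s(cycles, step_seconds, final_extension_s=0):
    """Total programmed hold time in seconds (temperature ramping excluded)."""
    return cycles * sum(step_seconds) + final_extension_s


# Buffer compositions as stated in the text (concentrations in mM).
annealing_mix = (10, {"KCl": 250, "Tris-HCl": 10, "EDTA": 1})
cdna_buffer = (15, {"Tris-HCl": 24, "MgCl2": 16, "DTT": 8, "dNTP": 0.4})

mix = final_concentrations([annealing_mix, cdna_buffer])
for name, conc in sorted(mix.items()):
    print(f"{name}: {conc:.2f} mM")

# 30 cycles of 30 s, 30 s and 60 s, plus a 10 min final extension.
print(pcr_hold_time_s(30, [30, 30, 60], final_extension_s=600) / 60, "min")
```

Under these assumptions the reverse transcription proceeds in 100 mM KCl and 9.6 mM MgCl2, 
and the PCR program holds for 70 min in total. 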

Whole mount in situ hybridization 

Whole mount in situ hybridizations were performed essentially as previously described (Engler et 
al. 1994). Cell cultures and somatic embryos were immobilized on poly-L-lysine coated glasses 
during fixation to improve handling. Whole mount in situ hybridization on explants took place by 
embedding hypocotyls from seven-day-old plantlets in 3% Seaplaque agarose (Duchefa) and 
processing them in Eppendorf tubes. Transverse as well as longitudinal sections were made 
with a vibratome (Biorad Microcut). Sections 50-170 µm thick were incubated in B5-2 medium 
for a minimum of three days to induce formation of embryo-forming cells. Optimal induction was 
achieved with longitudinal hypocotyl sections with a thickness of at least 90 µm. To obtain 
proliferating, non-embryogenic cell cultures, hypocotyl sections were exposed to 2,4-D for only 1 
day, and subsequently transferred to B5-0 medium (Guzzo et al. 1994). Whole mount in situ 
hybridization on developing seeds was performed after removing the chalazal end of the seeds to 
allow easier probe penetration. After hybridization, the enveloping layers of integuments and 
endosperm were carefully removed to expose the developing embryos. In situ hybridization on 
sections was performed as described previously (Sterk et al. 1991) except for the use of 
non-radioactive probes. 

All samples were fixed for 60 min in PBS containing 70 mM EGTA, 4% paraformaldehyde, 
0.25% glutaraldehyde, 0.1% Tween 20, and 10% DMSO. Samples were then washed, treated 
with proteinase K for 10 min, again washed and fixed a second time. Hybridization solution 
consisted of PBS containing 0.1% Tween 20, 330 mM NaCl, 50 µg/ml heparin, and 50% 
deionized formamide. Hybridization took place for 16 hours at 42°C using digoxigenin-labeled 
sense or antisense riboprobes (Boehringer Mannheim). After washing, the cells were treated with 
RNase A and incubated with anti-digoxigenin-alkaline phosphatase conjugate (Boehringer 
Mannheim) which had been preabsorbed with a plant protein extract. Excess antibody was 
removed by washing followed by rinsing in staining buffer (100 mM Tris-HCl pH 9.5, 100 mM 
NaCl, 5 mM MgCl2, 1 mM levamisole) and the staining reaction was performed for 16 hours in a 
buffer containing NBT and BCIP. Observations were performed using a Nikon Optiphot 
microscope equipped with Nomarski optics. 

Autophosphorylation assay 

A 1.4 kb SspI cDNA fragment of the SERK cDNA encoding most of the open reading frame 
apart from the N-terminal three LRRs was cloned into the pGEX expression vector (Pharmacia). 
A fusion protein consisting of SERK and the glutathione S-transferase gene product was 
synthesized by a three-hour induction of transformed E. coli with 2 mM IPTG. Fusion protein 
was isolated and purified as described previously (Horn and Walker, 1994). Purified fusion 
protein was coupled to glutathione agarose beads (Sigma) and incubated for 20 min at 20°C in 
a volume of 10 µl buffer: 50 mM Hepes (pH 7.6), 10 mM MgCl2, 10 mM MnCl2, 1 mM DTT, 1 
µCi [γ-32P]ATP (3 000 Ci/mmol). Excess label was removed by washing the fusion 
protein/glutathione agarose beads three times for 5 min in 50 mM Tris-HCl (pH 7.3), 10 mM 
MgCl2 at 4°C. Protein was removed from the beads by boiling in SDS-PAGE loading buffer. 
Equal amounts of protein were separated by SDS-PAGE and protein autophosphorylation was 
visualized by autoradiography. 

SERK fusion proteins produced with the Baculovirus expression system. 

Further fusion proteins containing the intracellular part of the Daucus carota SERK protein 
(1.0 kb HindIII / SspI fragment of the carrot SERK cDNA clone 31-50) were made using the 
baculovirus vector pAcHLT. 

In vitro phosphorylation studies with this purified protein showed that most if not all of the 
autophosphorylation of this SERK fusion protein was at threonine residues (Figure 6). 

Construction of viral transfer vectors 

The pAcHLT-B and pAcHLT-C baculovirus transfer vectors were used for the cloning of two 
cDNA fragments of the carrot SERK gene. The 1.41 kb SspI fragment of carrot DcSERK 
cDNA was cloned into the SmaI site of pAcHLT-B and the 1.07 kb SspI / PvuII fragment of 
carrot DcSERK cDNA was cloned into the SmaI site of pAcHLT-C. The first construct 
contains the complete C-terminal part of the DcSERK protein and, from the putative 
extracellular region, the proline-rich region and three of the leucine-rich repeats. The second 
construct contains only the putative intracellular region of the DcSERK gene product. 
Nucleotide sequence analysis was performed in order to confirm the presence and the 
orientation of the DcSERK cDNA within the vector. 

Transformation of insect cells 

The resulting transfer vectors were used to transfect (lipofect) the insect cell culture Sf21 from 
Spodoptera frugiperda in combination with linearized AcMNPV baculovirus DNA. 
Monolayers of Sf21 cells were transfected in 35 mm Petri dishes containing 2 ml of Hink's 
medium. One microgram of linearized AcMNPV baculovirus DNA (Baculogold, Invitrogen) 
was added to 5 micrograms of pAcHLT / SERK vector construct in 25 microliters of water. 
Fifteen microliters of Lipofectin (BRL) was mixed with 10 microliters of water, after which the 
DNA solution was added. After mixing, 200 microliters of Hink's medium was added to the mix 
and the solution was transferred to the cell monolayer, from which the medium had been 
removed. After one hour, 500 microliters of Hink's medium was added and the cells were 
incubated for another 3 hours. Finally, 1 ml of Hink's medium with 20% foetal bovine serum 
(FBS) was added and the cells were incubated for 4 days. After transfection, the viral 
infection could be identified by the reduced growth of cells, the swollen shape and the 
enlarged nucleus. After four days, infected cells were harvested and the medium containing 
infectious budded virus was collected and used for plaque assays and amplification of 
recombinant virus stocks. 

Isolation of single recombinant viruses 

Single recombinant virus plaques were isolated from monolayers of cells infected with a 
titration range of the primary virus stock. Infections were performed in 35 mm Petri dishes 
with monolayers of cells. Virus stocks were diluted in 600 microliters of Grace's medium and 
added to the cell monolayer, followed by a 90-minute incubation period in Grace's 
medium with 20% FBS. Afterwards, 3% Sea Plaque agarose was autoclaved, mixed with an 
equal amount of 2x Grace's medium with 20% FBS and, from the resulting agarose overlay 
solution, 2 ml was spread over the cell monolayers after removal of the viral inoculum. After 
4 days of incubation single plaques could be visualized and purified for further analysis. 

Fusion protein production. 

After determining the titer of purified recombinant viruses, monolayers of Sf21 cells in 75 
cm² flasks were infected at a multiplicity of infection (MOI) of 10. Incubation of cells with 
the virus inoculum was performed for 90 min, after which 8 ml of Hink's medium with 
10% FBS was added. After 3 days of incubation, cells were harvested and washed twice 
with PBS. Cells were lysed for 45 min on ice in twenty volumes of 1x insect cell lysis buffer 
(10 mM Tris pH 7.5, 130 mM NaCl, 1% Triton, 100 mM NaF, 10 mM NaPi, 10 mM NaPPi, 
with proteinase inhibitors: 16 mg/l benzamidine, 10 mg/l phenanthroline, 10 mg/l aprotinin, 
10 mg/l leupeptin, 10 mg/l pepstatin A, 1 mM PMSF). 
The lysate was cleared by centrifugation at 10,000 g for 30 min and the supernatant was 
incubated batchwise with TALON resin (with high affinity for the 6xHis tag of the recombinant 
fusion protein). Binding was performed by gentle agitation for 20 min at room temperature. The 
resin was washed three times with lysis buffer, followed by an elution step with lysis buffer 
with 200 mM imidazole. Purified fusion protein was collected and its purity and integrity were 
tested by SDS-PAGE. 
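
The infection at a multiplicity of infection of 10 presupposes a known virus titer. The 
following Python sketch shows one conventional way to derive a titer from a plaque count and 
to convert it into an inoculum volume. The plaque count, dilution and cell number used in the 
example are hypothetical values chosen for illustration and are not taken from the experiments 
described here; the function names are likewise illustrative. 

```python
def titer_pfu_per_ml(plaque_count, dilution, inoculum_ml):
    """Titer of the undiluted stock: plaques / (dilution x inoculum volume)."""
    return plaque_count / (dilution * inoculum_ml)


def inoculum_volume_ml(moi, cell_count, titer):
    """Volume of virus stock needed to infect cell_count cells at a given MOI."""
    return moi * cell_count / titer


# Hypothetical example: 45 plaques on a dish inoculated with 0.6 ml
# of a 10^-6 dilution of the virus stock.
titer = titer_pfu_per_ml(plaque_count=45, dilution=1e-6, inoculum_ml=0.6)

# Hypothetical cell number for one flask; an MOI of 10 is stated in the text.
volume = inoculum_volume_ml(moi=10, cell_count=1e7, titer=titer)

print(f"titer: {titer:.2e} pfu/ml")
print(f"stock needed for MOI 10: {volume:.2f} ml")
```

With these assumed numbers the stock contains about 7.5 x 10^7 plaque-forming units per ml, 
and about 1.3 ml of stock would be required for the flask. 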

Autophosphorylation assays 

Protein kinase activity was determined by incubating 1 microgram of purified fusion protein 
for 30 min at room temperature in a buffer containing 10 mM MgCl2, 10 mM MnCl2, 1 mM DTT 
and 10 µM [gamma-32P]ATP (10⁵ cpm/pmol ATP). The autophosphorylated fusion protein 
was purified after SDS-PAGE from the gel in a buffer containing 50 mM NH4CO3, 0.1% 
SDS, 0.25% beta-mercaptoethanol. Protein was precipitated with 20 µg/ml BSA and 20% 
(w/v) solid trichloroacetic acid. The precipitate was collected after centrifugation and hydrolysed 
in 50 µl 6 N HCl for 1 hour at 120 degrees Celsius. HCl was subsequently removed by 
lyophilization and the pellet was resuspended in a buffer consisting of 2.2% formic acid and 
7.8% acetic acid. Hydrolysed protein was loaded onto cellulose thin layer chromatography 
plates together with control amino acid samples (phosphoserine, phosphothreonine, 
phosphotyrosine). Chromatography was performed in a buffer containing propionic acid: 1 M 
ammonium hydroxide: isopropyl alcohol (90:35:35 v/v/v). After separation and drying of the 
plates, the separated amino acids were visualized by spraying with 0.25% ninhydrin in 
acetone, followed by heating for 5 min at 65 degrees Celsius. Plates were afterwards 
exposed to PhosphorImager cassettes in order to detect the phospho-labeled amino acids. 

SERK antibodies 

Purified fusion proteins (10 µg) were mixed in complete Freund adjuvant and injected IP 
into BALB/c mice. After 4 weeks booster antigen was injected (10 µg purified fusion protein 
in incomplete Freund adjuvant). Two weeks later a final booster was injected. One week 
after the final booster, serum was collected from these mice. The specificity and the titer of 
the resulting sera were tested on Western blots using total insect cell extracts with or without 
the SERK fusion proteins. 

INTRODUCTION OF THE SERK GENE INTO PLANTS AND THE PRODUCTION OF 
APOMICTIC SEED 

Carrot transformation with a SERK promoter fragment/luciferase gene fusion 

The binary vector pMT500 is based on the pBIN19 vector (Bevan, 1984) and contains the 
firefly luciferase gene downstream of a polylinker containing 5 unique restriction sites. It was 
created by uni-directional ligation of the firefly luciferase coding region followed by the 
polyadenylation sequence from the pea rbcS::E9 gene in the HindIII-XbaI site of the binary 
vector pMOG800 (kindly provided by Mogen N.V., Leiden, The Netherlands). The binary 
vector pMOG800 is based upon pBIN19 (Bevan, 1984) but while in pBIN19 the polylinker is 
flanked by the left border and the neomycin phosphotransferase (NPT II) expression 
cassette, the polylinker in pMOG800 is flanked by the right border and the NPT II 
expression cassette. From a genomic lambda clone, transcription regulating sequences 
from the carrot SERK gene were isolated by digestion with HindIII and DraI (SEQ ID No. 1), 
and cloned into the HindIII / SmaI sites of pBluescript SK+. From the resulting vector a KpnI 
/ SstI fragment containing the SERK genomic DNA was isolated and cloned into the KpnI / 
SstI sites of the binary vector pMT500. The resulting DNA construct, pMT531, contained the 
2200 bp genomic SERK DNA fragment as promoter sequence, the luciferase gene as vital 
reporter, and the E9 transcription terminator sequence. 

The binary vector pMT531 was transformed by electroporation into Agrobacterium 
tumefaciens strains MOG101 and MOG301 (for transformation into carrot cells) and into 
Agrobacterium tumefaciens strain C58C1 (for transformation into Arabidopsis thaliana 
plants). Transformed colonies were selected on LB plates with 100 mg/l kanamycin. 

Transformation of carrot cells 

The firefly luciferase coding sequence under control of the genomic carrot HindIII / DraI 
2200 bp DNA fragment was introduced into carrot cells by Agrobacterium tumefaciens 
mediated transformation of hypocotyl segments. Transformation of Daucus carota cv. 
'Amsterdamse bak' was performed by slicing one-week-old dark-grown seedlings into 
segments of 10 to 20 mm. Segments were incubated for 20 minutes in a freshly prepared 
10-fold diluted overnight culture of Agrobacterium. The segments were dried and 
transferred to a modified Gamborg's B5 medium (P1 medium; S&G seeds, Enkhuizen, The 
Netherlands) supplemented with 2 µM 2,4-D (P1-2) and solidified with agar (Difco, Detroit, 
MI, USA). After two days of culture in the dark at 25 ± 0.5 °C, segments were transferred to 
solidified P1-2 medium supplemented with kanamycin (100 mg·l⁻¹), carbenicillin (500 mg·l⁻¹; 
Duchefa) and vancomycin (100 mg·l⁻¹; Duchefa). After three weeks segments were 
transferred to fresh plates and transformed calli were selected after an additional three 
weeks. Transformed calli were grown on P1-2 plates with antibiotics for 3 weeks at a 16 
hour light/8 hour darkness regime. Transformed embryogenic suspension cultures were 
initiated as described by transferring 0.2 g callus to 10 ml liquid P1-2 medium 
supplemented with 200 mg·l⁻¹ kanamycin, 250 mg·l⁻¹ carbenicillin and 50 mg·l⁻¹ 
vancomycin. During the first weeks 1 to 3 volumes of fresh medium were added to the 
culture at weekly intervals. After 5 to 7 weeks cultures were subcultured to a packed cell 
volume of 2 ml per 50 ml medium every two weeks and incubated at a 16 hour light / 8 hour 
darkness regime at 25 ± 0.5 °C. 

One week after transfer to kanamycin selection medium, hypocotyl segments were sprayed 
with luciferin to test whether luciferase expression could be detected in transformed callus 
shortly after transformation. A large number of hypocotyl segments showed luciferase 
activity at the cut edges, but did not develop calli. Instead, growth of bacteria occurred, 
suggesting that the luciferase activity was of bacterial origin. Six to ten weeks after 
transformation, calli were obtained that showed luciferase activity in variable amounts, while 
no bacterial growth could be observed anymore. After 12 weeks, calli measuring 5 to 10 
mm in diameter were used to start suspension cultures. At this time no bacterial 
contamination was observed. In a control transformation experiment, luciferase 
expression under the influence of the CaMV 35S promoter was observed in single cells and cell 
clusters in the suspension culture, demonstrating that the luciferase protein is active in 
Daucus carota suspension cultured cells. 

Cell immobilisation 

One-week-old high-density (10⁶ - 10⁷ cells·ml⁻¹) suspension cultures were sieved through 
nylon sieves with successive 300, 125, 50 and 30 µm pore sizes (Monodur-PES; Verseidag 
Techfab, Walbeck, Germany). Single cells and cell clusters passing the last sieve are 
designated as < 30 µm populations. Control experiments with untransformed cells were 
performed with Daucus carota cv. 'Trophy' (S&G seeds) suspension cultures grown in P1-2 
medium. Size fractionated cell populations smaller than 30 µm were immobilised in phytagel 
(P8196; Sigma, St Louis, MO, USA) in petriperm dishes (Heraeus, Hanau, Germany). The 
bottom layer consisted of 1 ml P1-0 medium with 5 mM Ca²⁺ and 0.2 % phytagel. Two 
hundred thousand cells (< 30 µm and < 50 µm populations) in B5-0 medium without Ca²⁺ 
supplemented with 0.1 % phytagel were poured on top of the bottom layer. For this layer B5 
was applied since, at room temperature, phytagel solidified in P1 medium without Ca²⁺. 
After 2 hours of solidification an additional P1-0 layer with 0.2 % phytagel was poured onto 
the cell layer, preventing the B5 layer from moving. To prevent dehydration of the phytagel 
layers and to supply luciferin to the cells, 0.5 ml P1-0 medium containing 0.05 luciferin 
(Promega, Madison, WI, USA) was added after solidification. The final luciferin 
concentration in the culture was 0.02 µM. Luciferin detection on single cells was determined 
with a CCD camera for a period of 5 times one hour (Schmidt et al. (1997) Development 
124: 2049-2062). After 7 days of culture, luciferin was removed from the cultures by 
extensive washing with P1-0 medium. 



Arabidopsis transformation with a SERK promoter fragment/luciferase gene fusion 

Wildtype WS plants were grown under standard long day conditions: 16 hours light and 8 
hours dark. The first emerging inflorescence was removed in order to increase the 
number of inflorescences. Five days later, plants were ready for vacuum infiltration. 

Agrobacterium strain C58C1 containing the transformation plasmid was grown on an LB 
plate with 50 mg/l kanamycin, 50 mg/l rifampicin and 25 mg/l gentamycin. A single colony 
was used to inoculate 500 ml of LB medium containing 50 mg/l kanamycin, 50 mg/l 
rifampicin and 25 mg/l gentamycin. The cultures were grown overnight at 28 degrees Celsius 
and the resulting log phase culture (OD600 0.8) was centrifuged to pellet the cells and 
resuspended in 150 ml of infiltration medium (0.5x MS medium (pH 5.7) with 5% sucrose 
and 10 µl/l benzylaminopurine). The inflorescences of 6 Arabidopsis plants are submerged 
in the infiltration suspension while the remaining parts of the plants (which are still potted) 
are placed upside down on meshed wire to avoid contact with the infiltration suspension. 

Vacuum is applied to the whole set-up for 10 min at 50 kPa. Plants are directly afterwards 
placed under standard long day conditions. After completed seed setting the seeds were 
surface sterilized by a 1 % sodium hypochlorite soak, then thoroughly washed with sterile 
water and plated onto Petri dishes with 0.5x MS medium and 80 mg/l kanamycin in order to 
select for transformed seeds. After 5 days germination under long day conditions (10,000 
lux), the transformed seedlings could be identified by the green color of their cotyledons 
(the untransformed seedlings turn yellow), and were further grown in soil under C1 lab 
conditions under long day conditions. This vacuum infiltration method resulted in 
approximately 0.1% transformed seeds. 
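
At a transformation frequency of approximately 0.1%, the number of seeds that must be plated 
on selection medium to recover transformants can be estimated in advance. The following 
Python sketch illustrates this, assuming that each seed is transformed independently with the 
same probability; the seed numbers used are hypothetical and the function names are 
illustrative. 

```python
import math


def expected_transformants(seeds_plated, rate):
    """Expected number of transformed seeds among those plated."""
    return seeds_plated * rate


def prob_at_least_one(seeds_plated, rate):
    """Probability of recovering at least one transformant."""
    return 1.0 - (1.0 - rate) ** seeds_plated


def seeds_needed(rate, confidence):
    """Smallest number of seeds giving at least one transformant
    with the requested probability."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - rate))


RATE = 0.001  # approximately 0.1% transformed seeds, as stated in the text

# Hypothetical plating sizes, for illustration only.
print(expected_transformants(6000, RATE))
print(round(prob_at_least_one(1000, RATE), 3))
print(seeds_needed(RATE, 0.95))
```

Under this assumption several thousand seeds have to be screened per experiment in order to 
obtain a handful of independent primary transformants. 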

Transformation of Arabidopsis thaliana (WS) by vacuum infiltration with a construct containing 
both a gene encoding kanamycin resistance and the 2200 bp (HindIII / DraI) SERK genomic DNA 
fused to the firefly luciferase gene resulted in six different kanamycin-resistant 
primary transformants (I, II, III, IV, V and VI). Plants IV and VI died at the seedling stage, 
although they were kanamycin resistant. A T2 generation could be obtained from the four 
plants I, II, III and V (Figure 4). Within the siliques of the T2 generation of plants III and 
V, an early inhibition of development could be observed in approximately 25-50% of the 
seeds. Plants I and II did not show a reduction in the number of developing 
seeds (Figure 5). Similar results were observed in a T3 generation, in which again 
approximately 25-50% of the seeds showed an early inhibition of normal seed development. 
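
The text reports only approximate proportions (25-50%) and no seed counts. Purely as an illustration, if normal and arrested seeds were counted, a chi-square goodness-of-fit test could check whether the counts are compatible with a chosen expected proportion, for example 25% arrested seeds as expected for a single recessive locus. The counts below are invented and the 25% hypothesis is an assumption, not a statement of the source.

```python
def chi_square(observed: list[int], expected_fractions: list[float]) -> float:
    """Chi-square goodness-of-fit statistic for observed counts."""
    total = sum(observed)
    stat = 0.0
    for obs, frac in zip(observed, expected_fractions):
        exp = total * frac
        stat += (obs - exp) ** 2 / exp
    return stat


# Critical value of the chi-square distribution (1 degree of freedom, alpha = 0.05)
CRITICAL_DF1_P05 = 3.841

if __name__ == "__main__":
    # Hypothetical counts: [normal seeds, arrested seeds]
    counts = [148, 52]
    stat = chi_square(counts, [0.75, 0.25])
    verdict = "compatible with" if stat < CRITICAL_DF1_P05 else "deviates from"
    print(f"chi-square = {stat:.2f}; {verdict} a 3:1 ratio")
```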

Arabidopsis transformation with an AtSERK gene 

Isolation of the AtSERK genomic and cDNA clones 

Using the DcSERK cDNA sequence (SEQ ID NO: 2) as a probe, a lambda ZipLox genomic 
library made from Arabidopsis Landsberg erecta total genomic DNA is screened for the 
presence of homologous sequences. Three different lambda clones with inserts of 14, 18 
and 20 kb respectively are obtained. The 14 kb clone is digested with EcoRI and the resulting 
fragments are subcloned into pBluescript vectors. Fragments spanning the entire coding 
sequence of the AtSERK gene are isolated, sequenced and compared with the Daucus 
homologues. The resulting sequence is shown as SEQ ID NO: 20. 
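
Comparing the isolated Arabidopsis fragments with the Daucus homologues amounts to measuring similarity between aligned sequences. The source does not say which software or metric was used; the sketch below assumes two sequences that have already been aligned (gaps written as "-") and computes simple percent identity. The function name is illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns, skipping columns where both are gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

A column in which only one sequence has a gap is counted as a mismatch here; other conventions exist and would give slightly different figures.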

Using the DcSERK cDNA sequence (SEQ ID NO: 2) as a probe, a lambda ZAPII cDNA 
library is screened for the presence of homologous sequences. Four lambda clones are 
obtained and their inserts are subcloned into pBluescript vectors using the helper phage 
excision procedure. Fragments spanning the entire coding sequence of the AtSERK cDNA 
are isolated, sequenced and compared with the Daucus homologues. The 
resulting sequence is shown as SEQ ID NO: 32. 

Plasmids containing promoter sequences 

- The Arabidopsis thaliana LTP1 promoter fragment is obtained from the binary plasmid pUH1000 
(Thoma, S., Hecht, U., Kipper, A., Borella, J., De Vries, S.C., Sommerville, C. (1994) Plant 
Physiol. 105, 35-45) by digestion with BamHI and HindIII and cloning into pBluescript SK- 
(pMT121). 

- The CaMV 35S promoter enhanced by duplication of the -343 to -90 region (Kay et al. 
(1987) Science 236: 1299-1302) is isolated from the pMON999 vector by digestion with 
HindIII and SstI and cloned into the pBluescript SK- vector (pMT120). 

- The promoter AtDMC1 (Klimyuk and Jones (1997) Plant Journal 11: 1-14). 

Plasmid SLJ 9691 is a construct consisting of pBluescript SK+ in which the Arabidopsis 
thaliana DMC1 genomic clone (accession number U76670) is cloned into the EcoRV site. 
SLJ 9691 carries EcoRV fragments of the 5' end of the AtDMC1 gene with the following 
modifications: a BglII site instead of the second HpaI site, two ATG codons in the first exon 
and an XhoI site at the ATG codon of the second exon. 

- The FBP7 promoter from Petunia (Angenent et al. (1995) Plant Cell 7: 1569-1582). 

The promoter of the FBP7 gene is cloned by subcloning the 0.6 kb HindIII - XbaI genomic 
DNA fragment of FBP7 into the HindIII - XbaI site of pBluescript KS-, resulting in the vector 
FBP201. 

The pAtSERK binary vector constructs. 

Based on the pBIN19 vector, a binary vector pAtSERK is constructed for transformation of 
the Arabidopsis thaliana SERK cDNA under the control of different promoters. 

The full-length Arabidopsis thaliana cDNA clone of SERK (Seq ID No. NEW) is obtained 
from a pBluescript SK- plasmid. A SmaI - KpnI 2.1 kb fragment containing the AtSERK 
cDNA is cloned into pBIN19 SmaI - KpnI. The polyadenylation sequence from the pea 
rbcS::E9 gene (Millar et al. (1992) Plant Cell 4: 1075-1087) is placed downstream from the 
AtSERK cDNA by cloning a Klenow-filled EcoRI - HindIII E9 DNA fragment into the 
Klenow-filled XmaI site of the pBIN19:AtSERK vector in order to generate the binary vector 
pAtSERK. 

Construction of plant expression vectors 

The pAtSERK binary vector is used to generate the following promoter-AtSERK constructs. 

- The AtLTP1 promoter is cloned in the SmaI site of the pAtSERK binary vector as a 
Klenow-filled KpnI-SstI DNA fragment to give the pAtLTP1AtSERK vector. 

- The CaMV 35S promoter is cloned in the SmaI site of the pAtSERK binary vector as a 
Klenow-filled KpnI-SstI fragment to give the p35SAtSERK vector. 

- The AtDMC1 promoter consisting of the BglII - XhoI 3.3 kb fragment from the clone SLJ 
9691 is filled in with Klenow and cloned into the SmaI site of the pAtSERK binary vector to 
give the pAtDMC1AtSERK vector. 

- A SacI-KpnI fragment of FBP2101 is filled in with Klenow and cloned into the SmaI site of 
the pAtSERK binary vector to give the pFBP2101AtSERK vector. 
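
Each cloning step above depends on particular restriction sites being present in the fragment or vector. A minimal sketch for locating such sites in a sequence is given below. The recognition sequences are the standard ones for these enzymes; the function name is illustrative, and the sketch says nothing about where the sites actually lie in the constructs described here.

```python
RECOGNITION_SITES = {
    "SmaI": "CCCGGG",   # XmaI recognises the same sequence
    "KpnI": "GGTACC",
    "SstI": "GAGCTC",   # isoschizomer of SacI
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BglII": "AGATCT",
    "XhoI": "CTCGAG",
    "XbaI": "TCTAGA",
}


def find_sites(sequence: str) -> dict[str, list[int]]:
    """Return 1-based start positions of each recognition site that occurs."""
    seq = sequence.upper()
    hits: dict[str, list[int]] = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start + 1)
            start = seq.find(site, start + 1)  # allow overlapping matches
        if positions:
            hits[enzyme] = positions
    return hits
```

All of the listed sites are palindromic, so scanning one strand is sufficient.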

Introduction of plant expression vectors into Arabidopsis thaliana plant cells 

The above described vector constructs (pAtLTP1AtSERK, p35SAtSERK, pAtDMC1AtSERK, 
pFBP2101AtSERK) have been electrotransformed into Agrobacterium tumefaciens strain 
C58C1 as known in the art. 

Wild type Arabidopsis thaliana WS plants are grown under standard long day conditions: 16 
hours light and 8 hours dark. 

The first emerging inflorescence is removed in order to increase the 
number of inflorescences. Five days later, plants are ready for vacuum infiltration. 
Agrobacterium strain C58C1 containing the transformation plasmid (the pAtLTP1AtSERK 
vector, the p35SAtSERK vector, the pAtDMC1AtSERK vector or the pFBP2101AtSERK 
vector) is grown on an LB plate with 50 mg/l kanamycin, 50 mg/l rifampicin and 25 mg/l 
gentamycin. A single colony is used to inoculate 500 ml of LB medium containing 50 mg/l 
kanamycin, 50 mg/l rifampicin and 25 mg/l gentamycin. The cultures are grown overnight at 28 
degrees Celsius and the resulting log phase culture (OD600 0.8) is centrifuged to pellet the 
cells, which are resuspended in 150 ml of infiltration medium (0.5x MS medium (pH 5.7) with 5% 
sucrose and 1 mg/l benzylaminopurine). The inflorescences of 6 Arabidopsis plants are 
submerged in the infiltration suspension while the remaining parts of the plants (which are 
still potted) are placed upside down on meshed wire to avoid contact with the infiltration 
suspension. 

Vacuum is applied to the whole set-up for 10 min at 50 kPa. Directly afterwards the plants are 
placed under standard long day conditions. After seed set is complete, the seeds are 
surface sterilized by a soak in 1% sodium hypochlorite, then thoroughly washed with sterile 
water and plated onto Petri dishes with 0.5x MS medium and 80 mg/l kanamycin in order to 
select for transformed seeds. After 5 days of germination under long day conditions (10,000 
lux), the transformed seedlings can be identified by the green colour of their cotyledons 
(the untransformed seedlings turn yellow), and are grown further in soil under long day 
conditions. This vacuum infiltration method results in approximately 0.1% transformed 
seeds. 

Expression of SERK sequences in Arabidopsis thaliana plant cells 

The inflorescences from transgenic and non-transgenic Arabidopsis thaliana plants are 
analysed by whole-mount in situ hybridisation with AtSERK cDNA as probe. The 
inflorescences at different stages of development are fixed for 60 min in PBS containing 70 
mM EGTA, 4% paraformaldehyde, 0.25% glutaraldehyde, 0.1% Tween 20, and 10% DMSO. 
Samples are then washed, treated with proteinase K for 10 min, washed again and fixed a 
second time. The hybridisation solution consists of PBS containing 0.1% Tween 20, 330 mM NaCl, 
50 mg/ml heparin, and 50% deionized formamide. Hybridisation takes place for 16 hours at 42°C 
using digoxigenin-labeled sense or antisense riboprobes (Boehringer Mannheim). After washing, 
the cells are treated with RNase A and incubated with anti-digoxigenin-alkaline phosphatase 
conjugate (Boehringer Mannheim) which has been preabsorbed with a plant protein extract. 
Excess antibody is removed by washing followed by rinsing in staining buffer (100 mM Tris-HCl 
pH 9.5, 100 mM NaCl, 5 mM MgCl2, 1 mM levamisole) and the staining reaction is performed for 
16 hours in a buffer containing NBT and BCIP. Observations are performed using a Nikon 
Optiphot microscope equipped with Nomarski optics. 

The transformed plants show ectopic expression of SERK in the vicinity of the embryo sac. 

SEQUENCE LISTING 



(1) GENERAL INFORMATION : 

(i) APPLICANT: 

(A) NAME: NOVARTIS AG 

(B) STREET: Schwarzwaldallee 215 

(C) CITY: Basel 

(E) COUNTRY: Switzerland 

(F) POSTAL CODE (ZIP) : 4058 

(G) TELEPHONE: +41 61 69 11 H 

(H) TELEFAX: + 41 61 696 79 76 

(I) TELEX: 962 991 

(ii) TITLE OF INVENTION: Improvements in or relating to organic compounds 

(iii) NUMBER OF SEQUENCES: 33 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO) 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6695 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Daucus carota 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 3696.. 6617 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 3731.. 3802 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 3851.. 3979 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 4124.. 4211 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 4284.. 4357 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 4430.. 4528 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 4642.. 4757 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 4890.. 4967 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 5295.. 5803 

(ix) FEATURE: 

(A) NAME/KEY: intron 

(B) LOCATION: 6197.. 6339 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

TCTAGATGAC GAAATCGCGC TACCTTTGAT TITOGAAATAC TAGGTITCTAG TATCTTGATT 60 

AGTTTTTTGG ATATCTK3CT GTAATITCTr TAGGAGAIGC AAAOGGTCTT CATTTAATAT 120 

GAGCCCTTGT GACTTGACAA AAGTATCTAG CATUmUAT CACGAGGTTAG CTAAAAAGTA 180 

GCGTGTTTGA TTAAGCACAT AATATTGTAT TGGGCCTATT GGCTATCAAT GAAGTITCAT 240 

GCAAGTATAT AGCTICTATT ATOCATGTCA TGAQGGTATA TAAAAGAAGT AAAGAACATT 300 

CTCTCCTAGC ATTCATHTT CTCTTGCCTA TAGTITAACGA GTTTTGTCAC ACATGACGTT 360 

GAAACTGGAT c a u i iau r il' TTCCATCTAA GTTTGGATTA CCTGATAGAT GCICAACTTC 420 

TTOGTCAGCC TTTTCTTTOC GATTTITCCC AAGACAAGAT TOTTAGTITA ATAGTTATTC 480 

CTCTGGTGGC TTCICTGCAT TTIAGGAATC TTACTCTCTT TTTTAATGGA GAAACGAAAC 540 

CTACL-l'lTlT TICTGflOTIC CCTHTATGA TATCACCTGC TTCGAGGCGT TTAGACTTTA 600 

TCCACCTAAA CTATTCATCT TTACCAGACA AGCTATACGT TTTATCCCCC CCCCCCX3CGG 660 

ACCTGNGGAC AAAAGAAGCG CTGATCAACT GATITAATCC CTSTTTTATT ATA2TACACA 720 

TTQATGCTTC ATGGAGCTAA TATCTTTGGT TAAA.TTTCAT CTATATATAT ACCCTTCCCT 780 

CTICTGATGG CAGTGGCCCC TCGTTTAATT AGCGTACTTA ATTATCTGAT GGATACTGTA 840 

TCCTTGGCAG ATGATCTCAT CAGATTATAC CATITGTICT GCTCTACAAA ATAAAAAACC 900 

TCTAITTATG TTCATCTTIT TCCTAACAAG TAACTAA1TC ATCCGCTATG TTGACAGGCG 960 

ATCCATTACA CAACTTAOGA ACTAGCTTCC AAGATCCCAA CAATGTCCTG CAGAGCTGGG 1020 

ATCCAACCCT TGTGAACCCT TCCACATCGT TTCATCTGAC ATCTAACAAT GAAAACAGTG 1080 

TTATAAGAGT GTAGGTCACT TCCCTTATTA ATTTTTITAG CAAGTTACGA ATATITACTC 1140 

AATTGAGCAG ATGTCTCTTT AAATATTTTT CTTTAATTIC TTAGCTAAGC GGAGCATCTA 1200 

TCTTAAGTAT CTC7TACTCAA TTTAAGACAT AATACATTTT TTTAAAAAAT CTA1TAGAGT 1260 

GTTTITTCCG CACAGCGCAC ATATATCTTT TTTCTGGTAA TTCAGACAAC CTITCTCCCG 1320 

ACGATAAAAT AATATAAGAT TAACTCCTTG AACTAATTTT TTATTTTTCT TTICTTTITA 1380 

TCTTCTTTGC AGAAAGTTTC TTATGGTCTT TTCTGAAAAG TACA1TCTAT GATAAT1T1T 1440 

TQGCAACTCA TATAAATTTA TATATATICC ATGTAGTITAT AAC7TTAAAAA AAGCTTCCTA 1500 

TTAATTCCAA GATAGAGGTT CATTTTTATA GTTTCGGCAT CCATGAGTITr 1TGAAAATGT 1560 

CAGAAATTTT GTTGAGTTAA TTITACTTAC CAACTTTTAT GGCGTCATGC AGTCATCTTG 1620 

GGAATGCAGC ATTATCTGCT CAA3TOGTTC OTCTTQQOCA GTTGAAAAAT TTACAAEACT 1680 

TGTAAGACCA TATCACITGG AATGCTTTAG THTIATACA GCACAATCCT TTCAATATCT 1740 

GTCTAAAACTG TGAAAAAGTT GACTTTCTAG CTTCAGCAGT TGTTCGGATA ATATCTATGA 1800 

AGCACTTAAA AGGCTGGGCA ATlYITflCT TATTATTTCA AATATTCTTA ATT3TTACTA 1860 

CITAATATGA TAAACTGATT TAACTCCTCA TGATTGGTCT CAGTCCAATG TGCCCTCATT 1920 

AGTCACAINA TAAAATTQGN GGGTTGGACA AATATAACTT CTITICITAA GCTOCAGAAA 1980 

GAGCACTTAT CAACCTTGTC TAGCGCATAA CGTICACAGfTC QGTCAGTICAC GGGCTATCCA 2040 

GTTTQGGGAG GTTTTAATGA GCACTTATTT ACCTTGrTCTT TTAAACGTCT GAGGATGfTTA 2100 

1TAAAGTCTG CATCATTCAG AGfTCTAAAIT AGCACTTICA GTTGTA1TAT GAATGGTACA 2160 

TGAAAGATAC ATATCTTAAT GITCCTATGC CTOTITCAAC ATGTTCTCTAA TAOTCTOTTA 2220 

TCTITGTCAT CTTAAAAATC GCACTGATTA AAATGTTGAGA AAGCTAGTCT TCCAATACCA 2280 

TTTCATCTAT ACCAGAGAAT ATCATAATTT TTITAAATCA TAAGTTGQGC CTTAGAGTTT 2340 

TCTCAGTATT GGTCTATTTA TATTITCCAC CATTTAGAAC TGTGTITGTCA GATGAAAATC 2400 

TTGGACITCC ACAGAAGATC TTATAGTAAA AGrTATTCTIT AGATCTGATG ATGAAAGTTC 2460 

TCATGGTGTTG GCXTIX3TCCCA GAATCTAAAT CAATCCCATC TCACATGTTT GTIGATCTGA 2520 

CTACTCACTG TTAATCGAAG AGTAACTATT TCTGAATTAA ATCCTTITrr TTTICTTCTT 2580 

CATSCTTAGC GTTATAAAGG TCTACGTCTC ACTATGGTTT TTAACATCTTT ATAGTrTTCT 2640 

ACTGACAACT TTAAAGTTTC TCTTCTTTAC GAATTAAGAA TATATAATAT AAAACGCTTT 2700 

AACTTTCICT CT33AAGGTC TTCTTAOCTT TTTATATATA TATATAGATA CTCAGACTCT 2760 

GCTGGCAA1T ATATCTTACG AACTTACGAG TATACAGAAC TTOTATATTA GGTTCAGATG 2820 

AGTCGCICTA GTAGAACACC TTAAGCAAGA ACTTAATCAT GAGCTTPCAA OTITITAACT 2880 

TTCTTTITAG ATTTTTTCAA CTITATGGAA AATICTACCT CATGATCXJPG GTTTCTTTOC 2940 

ATAAACTTTC CATATAAGTC CXHTTCriTGA OTTTTTCATG TAAGCTCTTC ACGAGTGATT 3000 

ATTAGCGGTCT CTTTCAATAA TCATAATGTC TCTCACTTTG ATCAGGCCTG TACTTATTAT 3060 

TGCACCTTOC ACTTAACCTT GATCCTCATG TCATCTTGAT TOTCATACTC TACTAACCGA 3120 

GTTCAACATG GTTTATCATG TLT1T1GAGG TAACAATCTA GCTTTCACCT CTGTCCTTGA 3180 

TATAGGTTTA AGGCTTCCAC CTCCCACTAG CCTTIU3TT3 TITIATTCAC AGTTCACACA 3240 

CCTACTAGCA CT3TICACCT CTAGTCTTTT GTCOGCAAAT AGTAAGAAGT TTCTTICGCA 3300 

TAATAGTGGA TGATCATTTA AGAAATAGTTG AATCAAATTA TCCTGTITATT GTG1TICTAC 3360 

TTTGGAATTA AATGAGTTTGC TGAACATTGT TGCICTTTAT CCTTCTCAAG GCTITGCCAA 3420 

GGAAGGCGAT TAGTAAGAGT GGGCATCCAA GCGCCTTTAT CTTGAAGGGG CGGGCGGCAC 3480 

GTTGTGGAT7T CTGGGTGTTCT ATTAGAGGAC ATTATCTATA TATACTGATT ATTTATTAGA 3540 

ATATAAATCA ACTACTATAT TTITCTITCT AATGTTTATA TAGAAATCCC ACTCCTAAAC 3600 

TTGACAAATA CCATTGAAAT A1TTGAACCT AATTAA1TAG TAGTGTCAGG TTTAAATTCA 3660 

AACTCATTTA ATTTTACTTT AAAAAATAAT TCTATATGAA TCGTAACAGT ATAAATATAT 3720 

TAAATTACAT CTATCTGrTCC CTATATATAG CTGAATCTTCT AATAGACTCC AAGACGGCTG 3780 

CTCTIACTGC CTAGGCGTPCC AGGCACTTCA CTGATGCTTA CCTTGACAAA TATGGGCTTC 3840 

CTATCACATT GTTGGGGATC CCTATCACTG GATTCCTGTT TTGCTGACCC TCTGTTCAAT 3900 

TCATITTCAT TGATGTAGTA 1TACTAGTIT TATAAATATT CTITATTCCA ATAATTTAAC 3960 

TCGAGTTTAA CAA3X3ACAQG GAGCTTTACA GCAATAACAT AAGTCGACCA ATTCCTACTG 4020 

ATCTTCGGAA TCTGACAAAT 1TGGTGAGCT TGGACCTATA CATSAATAGC TTCTCTGGAC 4080 



CTATACCGGA CACATTAGGA AAGCTTACAA GGCTAAGATT CTTCTATCAC TACAAATCTT 4140 



CACTAGTTIT TAACTTAATC CAATITGATT ATCCTTTCAA GTGMTGATT ATATCACAAA 4200 



TTACTGGATA GGCCTCTCAA CAAGAACTGC CTCTCTGGTC CAATTCCAAT GTCACTGACT 4260 



AATATTACAA CTCTTCAAGT CCTGTTAACTA TTCCGACCTT TCCAGATAGT TTICTTGTTG 4320 



TGGATGTITC AATTCTAATA CTAAATATGT TCATCAGGGA TTTATCAAAC AATCGGCTAT 4380 



CAGGACCACT ACCGGATAAT GGCTCATTTT OT1OTAC ACCTATCAGG TTTAATGCTA 4440 



CTAATATCTT TAATATTATG GTTCTTACTT CTACTGCGAA AGCTATCATA ATATITITIT 4500 



TCTCCTTCAT ATATTATCAC TTTCGCAGTr 1TGGCAATAA TTTGAAHTA TGTIGGACCTG 4560 



TAACTGGGAG GCCCTCCCCT GGATCTCCCC CATTTTCTCC ACCACCTCCG 1TCATCCCAC 4620 



CATCAACAGT ACAGCCTCCA GGTGATTTAG TTTITATA3T AATTCCCGTA ATTAATTTTA 4680 



TGACTGTAAA AATTGGTCTT AATTICACCA GTTCOGAATA AAGTATTTIC CTICTTTCTC 4740 



TICTTATTAT TATCAAGGAC AAAATOGTCC CACK3GAGCT GAGTAGCTCC 4800 



TGGTCCT3CT TTACT3TTK3 CIO^CCTOC AATCGCATTT GCATCGTCGC GGAGAAGAAA 4860 



ACCGCGAGAA CATTTCTTTG ATGTGCCAGG TTAGTCCTCT AAATAGATAT CTATTGAAGC 4920 



GCTTACTCTC ToTGGACTTT GTITICACTG TCAITACTTA ACTTCAGCTC AAGAGGACCC 4980 



AGAAGTGCAC CITOGTCAAC TCAAGAGCTT TTCTCTGCGA GAATTGCAAG TCGCAACQGA 5040 



TACTTITACT ACCATCCITC GAAGAGCTQG ATITCGTAAG GTCTATAAGG GACGCCTTCC 5100 



TGATGGCTCA CTTGTAGCAG TTAAAAGGCT TAAAGAAGAA CGAACACCAG GTGGCGAGCT 5160 

GCftGTlTCAA ACAGAAGTQG AAATGATTAG CATGGCTGTG CATCGAAATC TICTGCGTCT 5220 

ACCTGCTTK: TGCATGACAC CTACCGAGCG GCTIdTGTA TATCCATACA TGGCTAATGG 5280 

AACTCTIGCG TCATGTITAA (3AGCTATCIC AGTIACAATT ACCATAACTT GCCAGAAGTT 5340 

TOXTIGMTA AAAATGAAAT ATAACTCCCT ACACTATGTT AAGGTGTrAT AATTTCTGAG 5400 

CAGATCTTAT TTCCCATTGC AAGATACCAG TTATTATTGT TmTCIGTA ATTGATACCG 5460 

CTTATATTIC TTTCTTGTAT TT3GTTATAT GCAAGGATTT CGAGTCTAAT AAGTTATCAA 5520 

ACTGGATGCT ATGTTTATTC TGCAATTGAA TICTTGCITC ATCTGCCAAA ATATATATGA 5580 

TTCAACITCG AATCATCTTA TAATATACTG TGTAAAGTCA GCTGTPGACT TTCATCATTA 5640 

ATTAGTCTTC ATAAATCAGA ATCTGCCTAG T3AGCTTTAC CGACATACTC TAAACCTTTC 5700 

TTATGGCCCT GTATATAATC GTCCCACTTA CTTTATTCAG TITGTCTGCT CTCTGAATIT 5760 

TTGATCTGTA CATTGTGATG TCTTGTTTTC ATCAAATGTA GAGCGTCAGC CATCAGAACC 5820 

TCCCCCIGAT TCGCCAACTA GGGAGAGGAT TGCACTAGGA TCTTCTAGGG GCCTATCTAA 5880 

ATTGCATGAC CATTGTGATC CCAAGATTAT CCATOGCGAT GTAAAAGCTG CAAATATATT 5940 

ATTGGACQAA GAATTTGAGG CTGTTGTAGG TGWTTTOGG TTAGCTAGGC TCATGGATTA 6000 

CAAGGATACC CATGTTAGGA CTGCTGTAAG GGGTTACCATT GGGCACATAG CTCCOGAGTA 6060 

CCTCTCGACT GGAAAGTCAT CAGAGAAGAC OGATGTCTTT GGTTATGGGA TAATCCTCCT 6120 

AGAGCICATT ACTGGACAGA GGGCTITIGA TCITGCTOGC CTK3GGAACG ATGATGATGT 6180 

TATGTTGTTG GATTGGGTAT GTGTCCCGGG lUlTULTl'lG GTTAATTATT TCACATATTA 6240 

c^acta ciOT gcotticit riTATnccr gcctgtattt GAncrrAcrr 6300 

CATGrTTATGC ATATTGACCT GCITIGCAAT OICTITrAGG TIAAAAGCCT TTIGAAAGAG 6360 

AAAAACTTGG AGATGCTGCTT CGATCCTGAC CIGCAGAACA ATTACAITGA CACAGAAGTTT 6420 

GAGCAGCTTA TTCAAGTAGC ATTACTCTGfT ACCCAGGGTT OGCCAATGGA GCGGCCTAAG 6480 

ATOICAGAGG TACTCCGAAT GOTTGAAGGT GATGGCCTTG CAGAAAASIG GGACGAGTIGG 6540 

CAAAAACTTC AAGTCATCCA TCAAGACGTA GAATTAGCTC CACATCGAAC TICIGAATGG 6600 

ATCCTAGACT CGACAGATAA CITCCATGCT TTIGAATTAT CIGGVCCJ^ ATAAACAGCA 6660 

TATAAAATGT AATGAAATTA ATATTTTTTA TQGTIT 6695 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1815 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Daucus carota 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 94.. 1752 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

GACAAATACC ATTGAAATAT TTGAACCTAA TTAATTAGTA GTGTCAGGTT TAAATTCAAA 60 

CTCATTTAAT TTTACTTTAA AAAATAATTC TAT ATG AAT CGT AAC AGT ATA AAT 114 
Met Asn Arg Asn Ser Ile Asn 

ATA TTA AAT TAC ATG CAG TTC ACT GAT GCT TAC CTT GAC AAA TAT GGG 162 
Ile Leu Asn Tyr Met Gln Phe Thr Asp Ala Tyr Leu Asp Lys Tyr Gly 

GTT CTT ATG ACA TTG GAG CTT TAC AGC AAT AAC ATA AGT GGA CCA ATT 210 
Val Leu Met Thr Leu Glu Leu Tyr Ser Asn Asn Ile Ser Gly Pro Ile 
35 

CCT AGT GAT CTT GGG AAT CTG ACA AAT TTG GTG AGC TTG GAC CTA TAC 258 
Pro Ser Asp Leu Gly Asn Leu Thr Asn Leu Val Ser Leu Asp Leu Tyr 
40 

ATG AAT AGC TTC TCT GGA CCT ATA CCG GAC ACA TTA GGA AAG CTT ACA 306 
Met Asn Ser Phe Ser Gly Pro Ile Pro Asp Thr Leu Gly Lys Leu Thr 
60 65 70 

AGG CTA AGA TTC TTG CGT CTC AAC AAC AAC AGC CTC TCT GGT CCA ATT 354 
Arg Leu Arg Phe Leu Arg Leu Asn Asn Asn Ser Leu Ser Gly Pro Ile 
75 80 85 

CCA ATG TCA CTG ACT AAT ATT ACA ACT CTT CAA GTC CTG GAT TTA TCA 402 
Pro Met Ser Leu Thr Asn Ile Thr Thr Leu Gln Val Leu Asp Leu Ser 
90 95 100 

AAC AAT CGG CTA TCA GGA CCA GTA CCG GAT AAT GGC TCA TTT TCT TTG 450 
Asn Asn Arg Leu Ser Gly Pro Val Pro Asp Asn Gly Ser Phe Ser Leu 
105 110 115 

TTT ACA CCT ATC AGT TTT GCC AAT AAT TTG AAT TTA TGT GGA CCC GTA 498 
Phe Thr Pro Ile Ser Phe Ala Asn Asn Leu Asn Leu Cys Gly Pro Val 
120 

ACT GGG AGG CCC TGC CCT GGA TCT CCC CCA TTT TCG CCA CCA CCT CCG 546 
Thr Gly Arg Pro Cys Pro Gly Ser Pro Pro Phe Ser Pro Pro Pro Pro 
140 145 150 

TTC ATC CCA CCA TCA ACA GTA CAG CCT CCA GGA CAA AAT GGT CCC ACT 594 
Phe Ile Pro Pro Ser Thr Val Gln Pro Pro Gly Gln Asn Gly Pro Thr 
155 160 165 

GGA GCT ATT GCT GGG GGA GTA GCT GCT GGT GCT GCT TTA CTG TTT GCT 642 
Gly Ala Ile Ala Gly Gly Val Ala Ala Gly Ala Ala Leu Leu Phe Ala 
170 

GCA CCT GCA ATG GCA TTT GCA TGG TGG CGG AGA AGA AAA CCG CGA GAA 690 
Ala Pro Ala Met Ala Phe Ala Trp Trp Arg Arg Arg Lys Pro Arg Glu 
185 190 195 

CAT TTC TTT GAT GTG CCA GCT GAA GAG GAC CCA GAA GTG CAC CTT GGT 738 
His Phe Phe Asp Val Pro Ala Glu Glu Asp Pro Glu Val His Leu Gly 
200 205 210 215 

CAA CTG AAG AGG TTT TCT CTG CGA GAA TTG CAA GTC GCA ACG GAT ACT 786 
Gln Leu Lys Arg Phe Ser Leu Arg Glu Leu Gln Val Ala Thr Asp Thr 
220 225 230 

TTT AGT ACC ATA CTT GGA AGA GGT GGA TTT GGT AAG GTG TAT AAG GGA 834 
Phe Ser Thr Ile Leu Gly Arg Gly Gly Phe Gly Lys Val Tyr Lys Gly 
235 240 245 

CGC CTT GCT GAT GGC TCA CTT GTA GCA GTT AAA AGG CTT AAA GAA GAA 882 
Arg Leu Ala Asp Gly Ser Leu Val Ala Val Lys Arg Leu Lys Glu Glu 
250 255 260 

CGA ACA CCA GGT GGT GAG CTG CAG TTT CAA ACA GAG GTG GAA ATG ATT 930 
Arg Thr Pro Gly Gly Glu Leu Gln Phe Gln Thr Glu Val Glu Met Ile 
265 270 275 



AGC ATG GCT GTG CAT CGA AAT CTT CTG CGT CTA CGT GGT TTC TGC ATG 978 
Ser Met Ala Val His Arg Asn Leu Leu Arg Leu Arg Gly Phe Cys Met 
280 285 290 295 

ACA CCA ACA GAG CGG CTT CTT GTA TAT CCA TAC ATG GCT AAT GGA AGT 1026 
Thr Pro Thr Glu Arg Leu Leu Val Tyr Pro Tyr Met Ala Asn Gly Ser 
300 305 310 

GTT GCG TCG TGT TTA AGA GAG CGT CAG CCA TCA GAA CCT CCC CTT GAT 1074 
Val Ala Ser Cys Leu Arg Glu Arg Gln Pro Ser Glu Pro Pro Leu Asp 
315 320 325 

TGG CCA ACT AGG AAG AGG ATT GCA CTA GGA TCT GCT AGG GGG CTT TCT 1122 
Trp Pro Thr Arg Lys Arg Ile Ala Leu Gly Ser Ala Arg Gly Leu Ser 
330 335 340 

TAT TTG CAT GAC CAT TGT GAT CCC AAG ATT ATC CAT CGT GAT GTA AAA 1170 
Tyr Leu His Asp His Cys Asp Pro Lys Ile Ile His Arg Asp Val Lys 
345 350 355 

GCT GCA AAT ATA TTA TTG GAC GAA GAA TTT GAG GCT GTT GTA GGT GAT 1218 
Ala Ala Asn Ile Leu Leu Asp Glu Glu Phe Glu Ala Val Val Gly Asp 
360 365 370 375 

TTT GGG TTA GCT AGG CTC ATG GAT TAC AAG GAT ACC CAT GTT ACA ACT 1266 
Phe Gly Leu Ala Arg Leu Met Asp Tyr Lys Asp Thr His Val Thr Thr 
380 385 390 

GCT GTA AGG GGT ACC TTG GGC TAC ATA GCT CCC GAG TAC CTC TCG ACT 1314 
Ala Val Arg Gly Thr Leu Gly Tyr Ile Ala Pro Glu Tyr Leu Ser Thr 
395 400 405 

GGA AAG TCA TCA GAG AAG ACC GAT GTC TTT GGT TAT GGG ATT ATG CTC 1362 
Gly Lys Ser Ser Glu Lys Thr Asp Val Phe Gly Tyr Gly Ile Met Leu 
410 415 420 

TTA GAG CTC ATT ACT GGA CAG AGA GCT TTT GAT CTT GCT CGC CTT GCG 1410 
Leu Glu Leu Ile Thr Gly Gln Arg Ala Phe Asp Leu Ala Arg Leu Ala 
425 430 435 

AAC GAT GAT GAT GTT ATG TTG TTG GAT TGG GTT AAA AGC CTT TTG AAA 1458 
Asn Asp Asp Asp Val Met Leu Leu Asp Trp Val Lys Ser Leu Leu Lys 
440 445 450 455 

GAG AAA AAG TTG GAG ATG CTG GTC GAT CCT GAC CTG GAG AAC AAT TAC 1506 
Glu Lys Lys Leu Glu Met Leu Val Asp Pro Asp Leu Glu Asn Asn Tyr 
460 465 470 

ATT GAC ACA GAA GTT GAG CAG CTT ATT CAA GTA GCA TTA CTC TGT ACC 1554 
Ile Asp Thr Glu Val Glu Gln Leu Ile Gln Val Ala Leu Leu Cys Thr 
475 480 485 

CAG GGT TCG CCA ATG GAG CGG CCT AAG ATG TCA GAG GTA GTC CGA ATG 1602 
Gln Gly Ser Pro Met Glu Arg Pro Lys Met Ser Glu Val Val Arg Met 
490 495 500 

CTT GAA GGT GAT GGC CTT GCA GAA AAG TGG GAC GAG TGG CAA AAA GTA 1650 
Leu Glu Gly Asp Gly Leu Ala Glu Lys Trp Asp Glu Trp Gln Lys Val 
505 510 515 

GAA GTC ATC CAT CAA GAC GTA GAA TTA GCT CCA CAT CGA ACT TCT GAA 1698 
Glu Val Ile His Gln Asp Val Glu Leu Ala Pro His Arg Thr Ser Glu 
520 525 530 535 

TGG ATC CTA GAC TCG ACA GAT AAC TTG CAT GCT TTT GAA TTA TCT GGT 1746 
Trp Ile Leu Asp Ser Thr Asp Asn Leu His Ala Phe Glu Leu Ser Gly 
540 545 550 

CCA AGA TAAACAGCAT ATAAAATGTG AATGAAATTA ATATTTTTTA TGGTTAAAAA 1802 
Pro Arg 

AAAAAAAAAA AAA 1815 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 553 amino acids 

(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Asn Arg Asn Ser Ile Asn Ile Leu Asn Tyr Met Gln Phe Thr Asp 
1 5 10 15 

Ala Tyr Leu Asp Lys Tyr Gly Val Leu Met Thr Leu Glu Leu Tyr Ser 
20 25 30 

Asn Asn Ile Ser Gly Pro Ile Pro Ser Asp Leu Gly Asn Leu Thr Asn 
35 40 45 

Leu Val Ser Leu Asp Leu Tyr Met Asn Ser Phe Ser Gly Pro Ile Pro 
50 55 60 

Asp Thr Leu Gly Lys Leu Thr Arg Leu Arg Phe Leu Arg Leu Asn Asn 
65 70 75 80 

Asn Ser Leu Ser Gly Pro Ile Pro Met Ser Leu Thr Asn Ile Thr Thr 
85 90 95 

Leu Gln Val Leu Asp Leu Ser Asn Asn Arg Leu Ser Gly Pro Val Pro 
100 105 110 

Asp Asn Gly Ser Phe Ser Leu Phe Thr Pro Ile Ser Phe Ala Asn Asn 
115 120 125 

Leu Asn Leu Cys Gly Pro Val Thr Gly Arg Pro Cys Pro Gly Ser Pro 
130 135 140 



Pro Phe Ser Pro Pro Pro Pro Phe Ile Pro Pro Ser Thr Val Gln Pro 
145 150 155 160 

Pro Gly Gln Asn Gly Pro Thr Gly Ala Ile Ala Gly Gly Val Ala Ala 
165 170 175 

Gly Ala Ala Leu Leu Phe Ala Ala Pro Ala Met Ala Phe Ala Trp Trp 
180 185 190 

Arg Arg Arg Lys Pro Arg Glu His Phe Phe Asp Val Pro Ala Glu Glu 
195 200 205 

Asp Pro Glu Val His Leu Gly Gln Leu Lys Arg Phe Ser Leu Arg Glu 
210 215 220 

Leu Gln Val Ala Thr Asp Thr Phe Ser Thr Ile Leu Gly Arg Gly Gly 
225 230 235 240 

Phe Gly Lys Val Tyr Lys Gly Arg Leu Ala Asp Gly Ser Leu Val Ala 
245 250 255 

Val Lys Arg Leu Lys Glu Glu Arg Thr Pro Gly Gly Glu Leu Gln Phe 
260 265 270 

Gln Thr Glu Val Glu Met Ile Ser Met Ala Val His Arg Asn Leu Leu 
275 280 285 

Arg Leu Arg Gly Phe Cys Met Thr Pro Thr Glu Arg Leu Leu Val Tyr 
290 295 300 

Pro Tyr Met Ala Asn Gly Ser Val Ala Ser Cys Leu Arg Glu Arg Gln 
305 310 315 320 

Pro Ser Glu Pro Pro Leu Asp Trp Pro Thr Arg Lys Arg Ile Ala Leu 
325 330 335 

Gly Ser Ala Arg Gly Leu Ser Tyr Leu His Asp His Cys Asp Pro Lys 
340 345 350 

Ile Ile His Arg Asp Val Lys Ala Ala Asn Ile Leu Leu Asp Glu Glu 
355 360 365 

Phe Glu Ala Val Val Gly Asp Phe Gly Leu Ala Arg Leu Met Asp Tyr 
370 375 380 

Lys Asp Thr His Val Thr Thr Ala Val Arg Gly Thr Leu Gly Tyr Ile 
385 390 395 400 

Ala Pro Glu Tyr Leu Ser Thr Gly Lys Ser Ser Glu Lys Thr Asp Val 
405 410 415 

Phe Gly Tyr Gly Ile Met Leu Leu Glu Leu Ile Thr Gly Gln Arg Ala 
420 425 430 

Phe Asp Leu Ala Arg Leu Ala Asn Asp Asp Asp Val Met Leu Leu Asp 
435 440 445 

Trp Val Lys Ser Leu Leu Lys Glu Lys Lys Leu Glu Met Leu Val Asp 
450 455 460 

Pro Asp Leu Glu Asn Asn Tyr Ile Asp Thr Glu Val Glu Gln Leu Ile 
465 470 475 480 

Gln Val Ala Leu Leu Cys Thr Gln Gly Ser Pro Met Glu Arg Pro Lys 
485 490 495 

Met Ser Glu Val Val Arg Met Leu Glu Gly Asp Gly Leu Ala Glu Lys 
500 505 510 

Trp Asp Glu Trp Gln Lys Val Glu Val Ile His Gln Asp Val Glu Leu 
515 520 525 

Ala Pro His Arg Thr Ser Glu Trp Ile Leu Asp Ser Thr Asp Asn Leu 
530 535 540 

His Ala Phe Glu Leu Ser Gly Pro Arg 
545 550 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
TTTTTTTTTT TGC 13 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GGGATCTAAG 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
ACACGTGCTC 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
TCAGCACAGG 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
TTTTTTTTTT TCTG 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
TTTTTTTTTT TCA 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GACATCGTCC 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CCCTACTGGT 10 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
ACACGTGGTC 10 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GGTGACTGTC 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TCTTGGACCA GATAATTC 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
CTCTGATGAC TTTCCAGTC 19 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

AATGGCATTT GCATGG 16

(2) INFORMATION FOR SEQ ID NO: 17: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Daucus carota 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Ser Pro Pro Pro Pro
1 5

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Daucus carota 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

His Arg Asp Val Lys Ala Ala Asn



WO 97/43427 

-62- 



PCT/EP97/02443 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: peptide 

(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Daucus carota 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Gly Thr Leu Gly Tyr Ile Ala Pro Glu
1 5 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4081 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 



(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Arabidopsis thaliana

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Arabidopsis SERK gene 

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1280..1367

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1796..1928

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 2014..2085

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 2203..2346

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 2450..2521

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 2617..2688

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 2772..2884

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 3015..3146

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 3305..3646

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 3760..4081
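
As a quick consistency check on the exon features listed above, the length of each exon can be computed from its LOCATION range. The sketch below is illustrative only: it assumes the ranges are 1-based and inclusive (the usual feature-table convention) and reads the seventh range as 2772..2884. The names `EXONS` and `feature_length` are hypothetical and are not part of the listing.

```python
# Exon LOCATION ranges for SEQ ID NO: 20, copied from the feature entries above.
# Assumption: coordinates are 1-based and inclusive.
EXONS = [
    (1280, 1367),
    (1796, 1928),
    (2014, 2085),
    (2203, 2346),
    (2450, 2521),
    (2617, 2688),
    (2772, 2884),  # printed as "2772.-2884" in the scanned text
    (3015, 3146),
    (3305, 3646),
    (3760, 4081),
]

SEQUENCE_LENGTH = 4081  # stated LENGTH of SEQ ID NO: 20, in base pairs


def feature_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based range start..end."""
    return end - start + 1


exon_lengths = [feature_length(start, end) for start, end in EXONS]
total_exon_length = sum(exon_lengths)

# Every exon must lie inside the sequence and exons must not overlap.
in_bounds = all(1 <= s <= e <= SEQUENCE_LENGTH for s, e in EXONS)
ordered = all(EXONS[i][1] < EXONS[i + 1][0] for i in range(len(EXONS) - 1))

if __name__ == "__main__":
    for (start, end), length in zip(EXONS, exon_lengths):
        print(f"exon {start}..{end}: {length} bases")
    print("total exon length:", total_exon_length)
```

Under these assumptions the ten exons are ordered, non-overlapping, and end exactly at position 4081, the stated length of the sequence.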



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

TCTAGAAACC TTTTGATCAT AATGAAAATA AAGAGTCCAT CCACCACATG GGCTAAGCAT 60 

AATGTGTGAT ATTTAAAGGG TAACAAATGT AATCTGCTTT TTATTTTACT TTTTACCTCT 120 

ACICAAATTG TATGGGCACT TITITITI ' IT TTTTAAATGA TAAGACAAGT ATCTGTTTAA 180 

TGGTATTGTG ATGAAACACT AGTAAAGTCA TATCGGGCAC GCCATACTAC TTCCACAGTG 240 

GAACTIGGCC AAATTTTGTC TTT3CCCTCT CTACAGTTTC TTCCACCAAA -m-mwriG 300 

ACAAAACTCA AATCTTTCAA TCTCATCTCT GCCAAAGTTG GGTTTAGAAA GAATATCAGC 360 

AAACACTAAT ATCTTTATTG TTGCATGGTT TATCAATCAC AAAATTCACA ACCATTGTAA 420 

AAAAAAATTC ACATTTTTGG TATGAGATTG CTCACATGAT AGTGAACCTC TTTAACATTT 480 

TAACTTTACT TTCATAAATA CGGGATTACG AATCTTACTT GCATTAAAAA TTTAGAAAAG 540 

GTTTTTCTAC TTAAAGAAAA AAGGGACCCA ACAGAGAGAG GTTTGACCAG GAGAAACGGG 600 

TGCATAGCCT TAAGAGCTTT CAACTACTTT ACCCCAAACC CAAAGCGATG TCACTTTCAA 660 

CCATCTCTTC TCTCCCCCGA ACCCGTTTIT TTGACCGGTC AGTTCGGGCA GCAGCACCGT 720 

TACGGGCAGC TTATATTCCT OGTCTTCCTC CTCTACACCA CTGCATGCCC ATAAATAAAG 780 
CCCGTTGAGA TCTTTAAAAA TATTAAATAA TATATCAACG AAAAAGCTAT TTTATTCATA 840



AGAAGAAAAA GAGAGGAACA ACAACAACAC ACTAATCATA GTITCTCTGG CAGGCTTGTT 900 



CaTTGCGGCTT AATAAAAAGC 



TTATTACTTC ACGTAGATTT TCCCCAAAAA 960 



GCTCTTATIT TTTIGTITAA AAAAAAAAGT TTCATCTTTA TTCAACTTTT GTTITACAGT 1020 



GTGTTGTCTGA GAGAGAGACT CTGCTTTGAT TGAGGAAAGA CGACGACGAG AACGCCGGAG 1080 



AATTAGGATT TTTATTTTAT TTTTTACTCT TTGTTTSTTT TAATGCTAAT GGCJ1T1TIAA 1140 



AAGGGOTATC GAAAAAATGA GTCAGTIT3T GTTGAGG1TC TCTCTGTAAA C7K3TTAATQG 1200 



TGGTCATTIT CGGAAG2TAG 



GATCTGAAGA GATCAAATCA AGATTCGAAA 1260 



1TTAGCA1TG TIGTTIGAAA TGGAGTCGAG TTATC3X3GTC TTTATCTTAC TTICACTGAT 1320 



CTTACTTCCG AATCATTCAC TGTGGCTTGC TTCTGCTAAT TTGGAAGGTT CGTGGTTACT 1380 



CAATTACTCA GCTTTACTCG TTTCTCAATT ACTTTCTCGA 



rA TTTGGAGGrlG 1440 



AATCGCTATC TTTAGTGTCT GCATTTTGAT 7TATCAAAAT TGTIGTTGTT CTTIGTATIT 1500 



GTAAGATTTA GTTGGCTAGTA CTTTGAATAC ACTGTnTIGC TTITCTTCTT CAGATCAACT 1560 



TTCTATATIG TAAAGGCATG I ' ll ' lTlU OST TGAAAAGCIG GGITATITGA TATCTTAAGA 1620 



TTGATGTIGT TGATCCAAAC ATTCTCTGAA AGACTICATT 



TTTGTAAAGA 1680 



A3TTGTITAA TTATTAGCCT CTAATCTCAG AGAGGCCICT TTGAATAGTIT CTCTCTTGAA 1740 



ATTAGACTIT ICACCAATrG ATGCTAATTG TCTAGA1TTC 1TCTTCTICT TATAGGTGAT 1800 



GCTTIGCATA CITIGAGGCT TACTCTAGTT GATCCAAACA ATGTCTTGCA GAGCIGGGAT 1860 



CCTACGCTAG TGAATCCTTG CACATGGTTC CATCTCACTT GCAACAACGA GAACAGTCTC 1920 
ATAAGAGTGT AAAGCTITCT TCTACTAATC CCACTJTTTA AACTITCACC TCAGCGTTGGT 1980

TACCGACATT mU ' lTlLTl ' TT3TCAAATA CAGTCATCTC GGGAATCCAG AGTTATCTGG 2040 

CCATTTAGTT CCAGAGCTTG CTCTGCTCAA GAATTTGCAG TMTICTAAG TOCCACTIAT 2100 

GCATCATGCT TTAACAAAAC AAATCCAAGA TITGACAGAA GAAGCACTGG AGTTACCTTT 2160 

TCTAA1TCAA A'lLTlTlTAA CAAGTTTCTT ATTITCTTAC AGGGAGCTIT ACAGTAACAA 2220 

CATAACTGGC CCX^ATTCCTA GTAATCTTGG AAATCTGACA AACTTAGTCA GTTTGGATCT 2280 

TTACTTAAAC AGCTTCTCCG CTCCTATPCC GGAATCA1TC GGAAAGCTTT CAAAGCTGAG 2340 

ATTTCTCTSA GTATACATAT GCTTTACOGG CTCAGTTACA GTCTITCTIT AATCTTAGGT 2400 

TTTOITCCAA TTTTTGACTC TETGCTGAAA ATTTTACATG CAAGAATAGC CGGCITAACA 2460 

ACAACAGTCT CACTGGGTCA ATTCCTATCT C^CTGACCAA TATTACTACC CTTCAAGTGT 2520 

TGTGAGTCCT CTCATTAACT TTCATTTATG TCTACTTCAT TCTCCCTCAG TTCATTPGTT 2580 

GACTTAATCC ACTTAACCTT GATQGATGCA ACACAGAGAT CTATCAAATA ACAGACTCTC 2640 

TGCTTTCAGtlT CCTGACAATG GCTCCTTCTC ACTCTTCACA CCCATCAGGT TCTATGATTT 2700 

ATCCTCTTCA GTTATTTCAG TTGTTGTGTC AGTCTCTGAA CTTATTCTCA AACTTTtATT 2760 

TCCTTGTCCA GTTTTIGCTAA TAACITAGAC CTATGTGGAC CT3TTACAAG TCACCCATOT 2820 

CCTGGATCTC CCCLVITITC TCCTCCACCA CLTITIATTC AACCTCCCCC AGTITCCACC 2880 

CCGAGTAAGC CTCCTCTTTT TAGrTTTACAT TATAGGAAAC AGAAGATGAA ATCTTTGCIT 2940 

CTCTGTCAAT CCTTITICIC ATATAACTCA TCTTGCCAAT AAGGCAATAA CCAAATGATC 3000 
TAATTTGATT TCAGGTGGGT ATGGTATAAC TGGAGCAATA GCTGGTIGGAG TISCIGCAGG 3060

^CTTTG CITITK^ CTCCIGCAAT AGCCFITCCr TGCTGGCGAC GAAGAAAGCC 3120 

ACTAGATATT TTCTTCGATG TGCCIGGTCA GITTATITOT CGCATTAGTr TCTOTTCTrA 3180 

GCCAGCAATT TTCTITK3CA GAAAAGTATT GGAACAACTG TIAATGAAAA TCAATACATA 3240 

AGKMTOIT ttitaagtia caaactctit tcactaaaat CTCGATTGCA AAATCTCTAT 3300 

GCAGCCGAAG AAGATCCAGA ACTTCATCK3 GGACAGCICA AGAGGTTITC TTIGCGGGAG 3360 

CTACAAGTGG CGAGTGATGG OTTTACTAAC AAGAACATIT TGGGCAGAGG TGGCTTIGGG 3420 

AAAGIOTAC A AGGGACGCTT GGCAGACGGA. ACTCTTGTTG CICICAAGAG ACISAAGGAA 3480 

GAGCGAACTC CAGGTGGAGA GCICCAGTIT CAAACAGAAG TAGAGATGAT AAGTATGGCA 3540 

GTTCATCGAA ACCIOTIUAG ATTACGAGGT TiraiATCA CACCGACCGA GAGATIGCIT 3600 

GTGTATCCIT ACATGGCCAA TQGAAGTGTT GCTTCGTGTC TCAGAC3GTAA AAACTAAACA 3660 

ATTAAACATC TTCIGCICIC TCTCAATTAC TTIUACGriGA AGICITITrr CATGTTITCC 3720 

TTTAT3GGTT CATAATIdT GGrTTACACTA ATGACACAGA GAGGCCACCG TCACAACCTC 3780 

CGCnGATTO GCCAACGCGG AAGAGAATCG OGCTAGGCTC AGCTCGAGCT TTGTGTTACC 3840 

TACA.TGATCA CTGCGATCCG AAGATCATTC ACOGTISACGT AAAAGCAGCA AACA.TCCTCT 3900 

TAGACGAAGA ATTCGAAGCG GITCITOGAG ATTTOGGGflT GGCAAAGCTA AT3GACTATA 3960 

AAGACACTCA CGTGACAACA GCAGTCCGTG GCACCATCGG TCACATOGCT CCAGAATATC 4020 

TCTCAACCGG AAAATCTTCA. GAGAAAACCG ACCnTTCGG ATACGGAATC ATGCTTCTAG 4080 

4081 
(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 494 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Met Glu Ser Ser Tyr Val Val Phe Ile Leu Leu Ser Leu Ile Leu Leu
1 5 10 15

Pro Asn His Ser Leu Trp Leu Ala Ser Ala Asn Leu Glu Gly Asp Ala
20 25 30

Leu His Thr Leu Arg Val Thr Leu Val Asp Pro Asn Asn Val Leu Gln
35 40 45

Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp Phe His Val Thr
50 55 60

Cys Asn Asn Glu Asn Ser Val Ile Arg Val Asp Leu Gly Asn Ala Glu
65 70 75 80

Leu Ser Gly His Leu Val Pro Glu Leu Gly Val Leu Lys Asn Leu Gln
85 90 95

Glu Leu Tyr Ser Asn Asn Ile Thr Gly Pro Ile Pro Ser Asn Leu Gly
100 105 110

Asn Leu Thr Asn Leu Val Ser Leu Asp Leu Tyr Leu Asn Ser Phe Ser
115 120 125

Gly Pro Ile Pro Glu Ser Leu Gly Lys Leu Ser Lys Leu Arg Phe Leu
130 135 140

Arg Leu Asn Asn Asn Ser Leu Thr Gly Ser Ile Pro Met Ser Leu Thr
145 150 155 160

Asn Ile Thr Thr Leu Gln Val Leu Asp Leu Ser Asn Asn Arg Leu Ser
165 170 175

Gly Ser Val Pro Asp Asn Gly Ser Phe Ser Leu Phe Thr Pro Ile Ser
180 185 190

Phe Ala Asn Asn Leu Asp Leu Cys Gly Pro Val Thr Ser His Pro Cys
195 200 205



Pro Gly Ser Pro Pro Phe Ser Pro Pro Pro Pro Phe Ile Gln Pro Pro
210 215 220



Pro Val Ser Thr Pro Ser Gly Tyr Gly Ile Thr Gly Ala Ile Ala Gly
225 230 235 240

Gly Val Ala Ala Gly Ala Ala Leu Leu Phe Ala Ala Pro Ala Ile Ala
245 250 255

Phe Ala Trp Trp Arg Arg Arg Lys Pro Leu Asp Ile Phe Phe Asp Val
260 265 270

Pro Ala Glu Glu Asp Pro Glu Val His Leu Gly Gln Leu Lys Arg Phe
275 280 285

Ser Leu Arg Glu Leu Gln Val Ala Ser Asp Gly Phe Ser Asn Lys Asn
290 295 300

Ile Leu Gly Arg Gly Gly Phe Gly Lys Val Tyr Lys Gly Arg Leu Ala
305 310 315 320

Asp Gly Thr Leu Val Ala Val Lys Arg Leu Lys Glu Glu Arg Thr Pro
325 330 335

Gly Gly Glu Leu Gln Phe Gln Thr Glu Val Glu Met Ile Ser Met Ala
340 345 350

Val His Arg Asn Leu Leu Arg Leu Arg Gly Phe Cys Met Thr Pro Thr
355 360 365

Glu Arg Leu Leu Val Tyr Pro Tyr Met Ala Asn Gly Ser Val Ala Ser
370 375 380

Cys Leu Arg Glu Arg Pro Pro Ser Gln Pro Pro Leu Asp Trp Pro Thr
385 390 395 400

Arg Lys Arg Ile Ala Leu Gly Ser Ala Arg Gly Leu Ser Tyr Leu His
405 410 415

Asp His Cys Asp Pro Lys Ile Ile His Arg Asp Val Lys Ala Ala Asn
420 425 430

Ile Leu Leu Asp Glu Glu Phe Glu Ala Val Val Gly Asp Phe Gly Leu
435 440 445

Ala Lys Leu Met Asp Tyr Lys Asp Thr His Val Thr Thr Ala Val Arg
450 455 460

Gly Thr Ile Gly His Ile Ala Pro Glu Tyr Leu Ser Thr Gly Lys Ser
465 470 475 480

Ser Glu Lys Thr Asp Val Phe Gly Tyr Gly Ile Met Leu Leu
485 490

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1106 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 142..795

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

TCGACCCACG CGTCCGTCCA ACTTCAATAA AGGGGAAACC AACCTAACCC TAATTTTGCT 60

TTCTCCTCTT TGTTCAGAAA ATTTTCCCTT TACTCTCAAA TTCCTTTTCG ATTTCCCTCT 120

CTTAAACCTC CGAAAGCTCA C ATG GCG TCT CGA AAC TAT CGG TGG GAG CTC 171
Met Ala Ser Arg Asn Tyr Arg Trp Glu Leu
1 5 10

TTC GCA GCT TCG TTA ACC CTA ACC TTA GCT TTG ATT CAC CTG GTC GAA 219
Phe Ala Ala Ser Leu Thr Leu Thr Leu Ala Leu Ile His Leu Val Glu
15 20 25

GCA AAC TCC GAA GGA GAT GCT CTC TAC GCT CTT CGC CGG ACT TTG ACA 267
Ala Asn Ser Glu Gly Asp Ala Leu Tyr Ala Leu Arg Arg Ser Leu Thr
30 35 40






GAT CCA GAC CAT GTC CTC CAG AGC TGG GAT CCA ACT CTT GTT AAT CCT 315
Asp Pro Asp His Val Leu Gln Ser Trp Asp Pro Thr Leu Val Asn Pro
45 50 55

TGT ACC TGG TTC CAT GTC ACC TGT AAC CAA GAC AAC CGC GTC ACT CGT 363
Cys Thr Trp Phe His Val Thr Cys Asn Gln Asp Asn Arg Val Thr Arg
60 65 70

GTG GAT TTG GGA AAT TCA AAC CTC TCT GGA CAT CTT GCG CCT GAG CTT 411
Val Asp Leu Gly Asn Ser Asn Leu Ser Gly His Leu Ala Pro Glu Leu
75 80 85 90

GGG AAG CTT GAA CAT TTA CAG TAT CTA GAG CTC TAC AAA AAC AAC ATC 459
Gly Lys Leu Glu His Leu Gln Tyr Leu Glu Leu Tyr Lys Asn Asn Ile
95 100 105

CAA GGA ACT ATA CCT TCC GAA CTT GGA AAT CTG AAG AAT CTC ATC AGC 507
Gln Gly Thr Ile Pro Ser Glu Leu Gly Asn Leu Lys Asn Leu Ile Ser
110 115 120

TTG GAT CTG TAC AAC AAC AAT CTT ACA GGG ATA GTT CCC ACT TTC TTG 555
Leu Asp Leu Tyr Asn Asn Asn Leu Thr Gly Ile Val Pro Thr Phe Leu
125 130 135

GGA AAA TTG AAG TCT CTG GTC TTT TTA CGG CTT AAT GAC AAC CGA TTG 603
Gly Lys Leu Lys Ser Leu Val Phe Leu Arg Leu Asn Asp Asn Arg Leu
140 145 150

ACC GGT CCA ATC CTA GAG CAC TCA CGG CAA TCC CAA GCC TTT AAA GTT 651
Thr Gly Pro Ile Leu Glu His Ser Arg Gln Ser Gln Ala Phe Lys Val
155 160 165 170

GTT GAC GTC TCA AGC AAT GAT TTG TGT GGG ACA ATC CCA ACA AAC GGA 699
Val Asp Val Ser Ser Asn Asp Leu Cys Gly Thr Ile Pro Thr Asn Gly
175 180 185

CCC TTT GCT CAC ATT CCT TTA CAG AAC TTT GAG AAC AAC CCG AGA TTG 747
Pro Phe Ala His Ile Pro Leu Gln Asn Phe Glu Asn Asn Pro Arg Leu
190 195 200

GAG GGA CCG GAA TTA CTC GGT CTT GCA AGC TAC GAC ACT AAC TGC ACC 795
Glu Gly Pro Glu Leu Leu Gly Leu Ala Ser Tyr Asp Thr Asn Cys Thr
205 210 215

TGAAACAACT GGCAAAACCT GAAAATGAAG AATTGGGGGG TGACCTTGTA AGAACACTTC 855 

ACCACTTTAT CAAATATCAC ATCEATTATG TAATAAGTAT ATATATGTAG TAAAAACAAA 915 

AAAAATGAAG AATCGAATCG CTAATATCAT CTO3TCTCAA TTCAGAACTT CGAGGTCTST 975 

ATGTAAAATT TCTAAATSCG ATTTTCGCTT ACTGTAATGT TCXX?TTCTCG GATTCTGAGA 1035 

AGTAACATTT GTATTGGTAT GCTATCAAGT ToTTCTGCCT TCTCTGCAAA AAAAAAAAAA 1095 

AAAAAAAAAA A 1106 
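
The codon rows of a CDS can be checked against the amino acid rows printed beneath them by translating with the standard genetic code. The sketch below is an illustration, not part of the listing: it assumes the standard code applies to this nuclear plant cDNA, and the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are hypothetical. The test codons are the row of SEQ ID NO: 22 that ends at position 363.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# One-letter to three-letter codes, as used in the listing ("***" marks a stop).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into three-letter residues."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


# Codon row of SEQ ID NO: 22 ending at position 363.
ROW_363 = "TGT ACC TGG TTC CAT GTC ACC TGT AAC CAA GAC AAC CGC GTC ACT CGT"

if __name__ == "__main__":
    print(translate(ROW_363))
```

Rows whose translation disagrees with the printed residues are candidates for scanning errors in either the nucleotide or the amino acid line.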



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 218 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Met Ala Ser Arg Asn Tyr Arg Trp Glu Leu Phe Ala Ala Ser Leu Thr
1 5 10 15

Leu Thr Leu Ala Leu Ile His Leu Val Glu Ala Asn Ser Glu Gly Asp
20 25 30

Ala Leu Tyr Ala Leu Arg Arg Ser Leu Thr Asp Pro Asp His Val Leu
35 40 45

Gln Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp Phe His Val
50 55 60

Thr Cys Asn Gln Asp Asn Arg Val Thr Arg Val Asp Leu Gly Asn Ser
65 70 75 80

Asn Leu Ser Gly His Leu Ala Pro Glu Leu Gly Lys Leu Glu His Leu
85 90 95

Gln Tyr Leu Glu Leu Tyr Lys Asn Asn Ile Gln Gly Thr Ile Pro Ser
100 105 110

Glu Leu Gly Asn Leu Lys Asn Leu Ile Ser Leu Asp Leu Tyr Asn Asn
115 120 125

Asn Leu Thr Gly Ile Val Pro Thr Phe Leu Gly Lys Leu Lys Ser Leu
130 135 140

Val Phe Leu Arg Leu Asn Asp Asn Arg Leu Thr Gly Pro Ile Leu Glu
145 150 155 160

His Ser Arg Gln Ser Gln Ala Phe Lys Val Val Asp Val Ser Ser Asn
165 170 175

Asp Leu Cys Gly Thr Ile Pro Thr Asn Gly Pro Phe Ala His Ile Pro
180 185 190

Leu Gln Asn Phe Glu Asn Asn Pro Arg Leu Glu Gly Pro Glu Leu Leu
195 200 205

Gly Leu Ala Ser Tyr Asp Thr Asn Cys Thr
210 215



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 981 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 104..757

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

AGTCTGAGTA ATTTAGTTTG CTTTCTCCTC TTTCTTCAGA AAATTTTCCC TTTACTCTCA 60

AATTCCTTTT CGATTTCCCT CTCTTAAACC TCCGAAAGCT CAC ATG GCG TCT CGA 115
Met Ala Ser Arg
1

AAC TAT CGG TGG GAG CTC TTC GCA GCT TCG TTA ACC CTA ACC TTA GCT 163
Asn Tyr Arg Trp Glu Leu Phe Ala Ala Ser Leu Thr Leu Thr Leu Ala
5 10 15 20

TTC ATT CAC CTG GTC GAA GCA AAC TCC GAA GGA GAT GCT CTC TAC GCT 211
Leu Ile His Leu Val Glu Ala Asn Ser Glu Gly Asp Ala Leu Tyr Ala
25 30 35

CTT CGC CGG ACT TTG ACA GAT CCA GAC CAT GTC CTC CAG AGC TGG GAT 259
Leu Arg Arg Ser Leu Thr Asp Pro Asp His Val Leu Gln Ser Trp Asp
40 45 50

CCA ACT CTT GTT AAT CCT TGT ACC TGG TTC CAT GTC ACC TGT AAC CAA 307
Pro Thr Leu Val Asn Pro Cys Thr Trp Phe His Val Thr Cys Asn Gln
55 60 65

GAC AAC CGC GTC ACT CGT GTG GAT TTG GGA AAT TCA AAC CTC TCT GGA 355
Asp Asn Arg Val Thr Arg Val Asp Leu Gly Asn Ser Asn Leu Ser Gly
70 75 80

CAT CTT GCG CCT GAG CTT GGG AAG CTT GAA CAT TTA CAG TAT CTA GAG 403
His Leu Ala Pro Glu Leu Gly Lys Leu Glu His Leu Gln Tyr Leu Glu
85 90 95 100

CTC TAC AAA AAC AAC ATC CAA GGA ACT ATA CCT TCC GAA CTT GGA AAT 451
Leu Tyr Lys Asn Asn Ile Gln Gly Thr Ile Pro Ser Glu Leu Gly Asn
105 110 115

CTG AAG AAT CTC ATC AGC TTG GAT CTG TAC AAC AAC AAT CTT ACA GGG 499
Leu Lys Asn Leu Ile Ser Leu Asp Leu Tyr Asn Asn Asn Leu Thr Gly
120 125 130

ATA GTT CCC ACT TCT TTG GGA AAA TTG AAG TCT CTG GTC TTT TTA CGG 547
Ile Val Pro Thr Ser Leu Gly Lys Leu Lys Ser Leu Val Phe Leu Arg
135 140 145

CTT AAT GAC AAC CGA TTG ACC GGT CCA ATC CCT AGA GCA CTC ACG GCA 595
Leu Asn Asp Asn Arg Leu Thr Gly Pro Ile Pro Arg Ala Leu Thr Ala
150 155 160

ATC CCA AGC CTT AAA GTT GTT GAC GTC TCA AGC AAT GAT TTG TGT GGA 643
Ile Pro Ser Leu Lys Val Val Asp Val Ser Ser Asn Asp Leu Cys Gly
165 170 175 180

ACA ATC CCA ACA AAC GGA CCC TTT GCT CAC ATT CCT TTA CAG AAC TTT 691
Thr Ile Pro Thr Asn Gly Pro Phe Ala His Ile Pro Leu Gln Asn Phe
185 190 195

GAG AAC AAC CCG AGA TTG GAG GGA CCG GAA TTA CTC GGT CTT GCA AGC 739
Glu Asn Asn Pro Arg Leu Glu Gly Pro Glu Leu Leu Gly Leu Ala Ser
200 205 210



TAC GAC ACT AAC TGC ACC TGAAACAACT GGCAAAACCT GAAAATGAAG 787
Tyr Asp Thr Asn Cys Thr
215

AATTGGGGGG TGACCTTGTA AGAACACTTC ACCACTTTAT CAAATATCAC ATCTATTATG 847

TAATAAGTAT ATATATGTAG TAAAAACAAA AAAAATGAAG AATCGAATCG GTAATATCAT 907

CTGGTCTCAA TTGAGAACTT CGAGGTCTGT ATGTAAAATT TCTAAATGCG ATTTTCGCCT 967

AAATTACTCA CACT 981

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 218 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Met Ala Ser Arg Asn Tyr Arg Trp Glu Leu Phe Ala Ala Ser Leu Thr
1 5 10 15

Leu Thr Leu Ala Leu Ile His Leu Val Glu Ala Asn Ser Glu Gly Asp
20 25 30

Ala Leu Tyr Ala Leu Arg Arg Ser Leu Thr Asp Pro Asp His Val Leu
35 40 45

Gln Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp Phe His Val
50 55 60

Thr Cys Asn Gln Asp Asn Arg Val Thr Arg Val Asp Leu Gly Asn Ser
65 70 75 80

Asn Leu Ser Gly His Leu Ala Pro Glu Leu Gly Lys Leu Glu His Leu
85 90 95

Gln Tyr Leu Glu Leu Tyr Lys Asn Asn Ile Gln Gly Thr Ile Pro Ser
100 105 110

Glu Leu Gly Asn Leu Lys Asn Leu Ile Ser Leu Asp Leu Tyr Asn Asn
115 120 125

Asn Leu Thr Gly Ile Val Pro Thr Ser Leu Gly Lys Leu Lys Ser Leu
130 135 140

Val Phe Leu Arg Leu Asn Asp Asn Arg Leu Thr Gly Pro Ile Pro Arg
145 150 155 160

Ala Leu Thr Ala Ile Pro Ser Leu Lys Val Val Asp Val Ser Ser Asn
165 170 175

Asp Leu Cys Gly Thr Ile Pro Thr Asn Gly Pro Phe Ala His Ile Pro
180 185 190

Leu Gln Asn Phe Glu Asn Asn Pro Arg Leu Glu Gly Pro Glu Leu Leu
195 200 205

Gly Leu Ala Ser Tyr Asp Thr Asn Cys Thr
210 215

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 789 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..661

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

T CGA CCC ACG CGT CCG CGA AAC TAT CGG TGG GAG CTC TTC GCA GCT 46
Arg Pro Thr Arg Pro Arg Asn Tyr Arg Trp Glu Leu Phe Ala Ala
1 5 10 15

TCG TTA ATC CTA ACC TTA GCT TTG ATT CAC CTG GTC GAA GCA AAC TCC 94
Ser Leu Ile Leu Thr Leu Ala Leu Ile His Leu Val Glu Ala Asn Ser
20 25 30

GAA GGA GAT GCT CTT TAC GCT CTT CGC CGG AGT TTA ACA GAT CCG GAC 142
Glu Gly Asp Ala Leu Tyr Ala Leu Arg Arg Ser Leu Thr Asp Pro Asp
35 40 45

CAT GTT CTC CAG AGC TGG GAT CCA ACT CTT GTT AAT CCT TGT ACC TGG 190
His Val Leu Gln Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp
50 55 60

TTC CAT GTC ACC TGT AAC CAA GAC AAC CGC GTC ACT CGT GTG GAT TTG 238
Phe His Val Thr Cys Asn Gln Asp Asn Arg Val Thr Arg Val Asp Leu
65 70 75

GGG AAT TCA AAC CTC TCT GGA CAT CTT GCG CCT GAG CTT GGG AAG CTT 286
Gly Asn Ser Asn Leu Ser Gly His Leu Ala Pro Glu Leu Gly Lys Leu
80 85 90 95

334
Glu His Leu Gln Tyr Leu Glu Leu Tyr Lys Asn Asn Ile Gln Gly Thr
100 105 110

ATA CCT TCC GAA CTT GGA AAT CTG AAG AAT CTC ATC AGC TTG GAT CTG 382
Ile Pro Ser Glu Leu Gly Asn Leu Lys Asn Leu Ile Ser Leu Asp Leu
115 120 125

TAC AAC AAC AAT CTT ACA GGG ATA GTT CCC ACT TCT TTG GGA AAA TTG 430
Tyr Asn Asn Asn Leu Thr Gly Ile Val Pro Thr Ser Leu Gly Lys Leu
130 135 140

AAG TCT CTG GTC TTT TTA CGG CTT AAT GAC AAC CGA TTG ACG GGG CCA 478
Lys Ser Leu Val Phe Leu Arg Leu Asn Asp Asn Arg Leu Thr Gly Pro
145 150 155

ATC CCT AGA GCA CTC ACT GCA ATC CCA AGC CTT AAA GTT GTT GAT GTC 526
Ile Pro Arg Ala Leu Thr Ala Ile Pro Ser Leu Lys Val Val Asp Val
160 165 170 175

TCA AGC AAT GAT TTG TGT GGA ACA ATC CCA ACA AAC GGA CCT TTT GCT 574
Ser Ser Asn Asp Leu Cys Gly Thr Ile Pro Thr Asn Gly Pro Phe Ala
180 185 190

CAC ATT CCT TTA CAG AAC TTT GAG AAC AAC CCG AGG TTG GAG GGA CCG 622
His Ile Pro Leu Gln Asn Phe Glu Asn Asn Pro Arg Leu Glu Gly Pro
195 200 205

GAA TTA CTC GGT CTT GCA AGC TAC GAC ACT AAC TGC AGC TGAAAAAATT 671
Glu Leu Leu Gly Leu Ala Ser Tyr Asp Thr Asn Cys Thr
210 215 220

GGCAAAACCT GAAAATGAAG AATTGGGGGG TGACCTTGTA AGAACACTTC ACCACTTTAT 731

CAAATATCAC ATCTACTATG TAATAAGTAT ATATATGTAG TCCAAAAAAA AAAAAAAA 789

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 220 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Arg Pro Thr Arg Pro Arg Asn Tyr Arg Trp Glu Leu Phe Ala Ala Ser
1 5 10 15

Leu Ile Leu Thr Leu Ala Leu Ile His Leu Val Glu Ala Asn Ser Glu
20 25 30

Gly Asp Ala Leu Tyr Ala Leu Arg Arg Ser Leu Thr Asp Pro Asp His
35 40 45

Val Leu Gln Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp Phe
50 55 60

His Val Thr Cys Asn Gln Asp Asn Arg Val Thr Arg Val Asp Leu Gly
65 70 75 80

Asn Ser Asn Leu Ser Gly His Leu Ala Pro Glu Leu Gly Lys Leu Glu
85 90 95

His Leu Gln Tyr Leu Glu Leu Tyr Lys Asn Asn Ile Gln Gly Thr Ile
100 105 110

Pro Ser Glu Leu Gly Asn Leu Lys Asn Leu Ile Ser Leu Asp Leu Tyr
115 120 125

Asn Asn Asn Leu Thr Gly Ile Val Pro Thr Ser Leu Gly Lys Leu Lys
130 135 140

Ser Leu Val Phe Leu Arg Leu Asn Asp Asn Arg Leu Thr Gly Pro Ile
145 150 155 160

Pro Arg Ala Leu Thr Ala Ile Pro Ser Leu Lys Val Val Asp Val Ser
165 170 175

Ser Asn Asp Leu Cys Gly Thr Ile Pro Thr Asn Gly Pro Phe Ala His
180 185 190

Ile Pro Leu Gln Asn Phe Glu Asn Asn Pro Arg Leu Glu Gly Pro Glu
195 200 205

Leu Leu Gly Leu Ala Ser Tyr Asp Thr Asn Cys Thr
210 215 220



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 894 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA



(iii) HYPOTHETICAL: NO 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..675 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

GGA CCG ATT CAA GCC TCC GAA GGG GAC GCT CTT CAC GCG CTT CGC CGG 48
Gly Pro Ile Gln Ala Ser Glu Gly Asp Ala Leu His Ala Leu Arg Arg
1 5 10 15

AGC TTA TCA GAT CCA GAC AAT GTT CTT CAG ACT TGG GAT CCA ACT CTT 96
Ser Leu Ser Asp Pro Asp Asn Val Val Gln Ser Trp Asp Pro Thr Leu
20 25 30

GTT AAT CCT TCT ACT TGG TTT CAT GTC ACT TCT AAT CAA CAC CAT CAA 144
Val Asn Pro Cys Thr Trp Phe His Val Thr Cys Asn Gln His His Gln
35 40 45

GTC ACT CGT CTG GAT TTG GGG AAT TCA AAC TTA TCT GGA CAT CTA CTA 192
Val Thr Arg Leu Asp Leu Gly Asn Ser Asn Leu Ser Gly His Leu Val
50 55 60

CCT GAA CTT GGG AAG CTT GAA CAT TTA CAA TAT CTG TAT GGA ATC ATC 240
Pro Glu Leu Gly Lys Leu Glu His Leu Gln Tyr Leu Tyr Gly Ile Ile
65 70 75 80

ACT CTT TTG CCT TTT GAT TAT CTG AAA ACA TTT ACA TTA TCA GTC ACA 288
Thr Leu Leu Pro Phe Asp Tyr Leu Lys Thr Phe Thr Leu Ser Val Thr
85 90 95

CAT ATA ACA TTT TGC TTT GAG TCA TAT ACT GAA CTC TAC AAA AAC GAG 336
His Ile Thr Phe Cys Phe Glu Ser Tyr Ser Glu Leu Tyr Lys Asn Glu
100 105 110

ATT CAA GGA ACT ATA CCT TCT GAG CTT GGA AAT CTG AAG ACT CTA ATC 384
Ile Gln Gly Thr Ile Pro Ser Glu Leu Gly Asn Leu Lys Ser Leu Ile
115 120 125

AGT TTG GAT CTG TAC AAC AAC AAT CTC ACC GGG AAA ATC CCA TCT TCT 432
Ser Leu Asp Leu Tyr Asn Asn Asn Leu Thr Gly Lys Ile Pro Ser Ser
130 135 140

TTG GGA AAA TTG AAG TCA CTT GTT TTT TTG CGG CTT AAC GAA AAC CGA 480
Leu Gly Lys Leu Lys Ser Leu Val Phe Leu Arg Leu Asn Glu Asn Arg
145 150 155 160

TTG ACC GGT CCT ATT CCT AGA GAA CTC ACA GTT ATT TCA AGC CTT AAA 528
Leu Thr Gly Pro Ile Pro Arg Glu Leu Thr Val Ile Ser Ser Leu Lys
165 170 175

GTT GTT GAT GTC TCA GGG AAT GAT TTG TGT GGA ACA ATT CCA GTA GAA 576
Val Val Asp Val Ser Gly Asn Asp Leu Cys Gly Thr Ile Pro Val Glu
180 185 190

GGA CCT TTT GAA CAC ATT CCT ATG CAA AAC TTT GAG AAC AAC CTG AGA 624
Gly Pro Phe Glu His Ile Pro Met Gln Asn Phe Glu Asn Asn Leu Arg
195 200 205

TTG GAG GGA CCA GAA CTA CTA GGT CTT GCG AGC TAT GAC ACC AAT TGC 672
Leu Glu Gly Pro Glu Leu Leu Gly Leu Ala Ser Tyr Asp Thr Asn Cys
210 215 220

ACT TAAAAAGAAG TTGAAGAACC TATAAAGAAG AATGTTAGGT GACCTTGTAA 725
Thr
225

GAACTCTGTA CCAAGTGTTT GTAAATCTAT ATAGAGCCTT GTTTCATGTT ATATATGAAA 785

GCTTTGAGAG ACAGTAACTT GCAATGTATT GGTATTGGTA GAAAAAGTTG AAATGAGAAT 845

TGCTTTGTAA TTGGATTTGT GTTTCTTATG TAACTTGAAT TTCTTATTA 894



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 225 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Gly Pro Ile Gln Ala Ser Glu Gly Asp Ala Leu His Ala Leu Arg Arg
1 5 10 15

Ser Leu Ser Asp Pro Asp Asn Val Val Gln Ser Trp Asp Pro Thr Leu
20 25 30

Val Asn Pro Cys Thr Trp Phe His Val Thr Cys Asn Gln His His Gln
35 40 45

Val Thr Arg Leu Asp Leu Gly Asn Ser Asn Leu Ser Gly His Leu Val
50 55 60

Pro Glu Leu Gly Lys Leu Glu His Leu Gln Tyr Leu Tyr Gly Ile Ile
65 70 75 80

Thr Leu Leu Pro Phe Asp Tyr Leu Lys Thr Phe Thr Leu Ser Val Thr
85 90 95

His Ile Thr Phe Cys Phe Glu Ser Tyr Ser Glu Leu Tyr Lys Asn Glu
100 105 110

Ile Gln Gly Thr Ile Pro Ser Glu Leu Gly Asn Leu Lys Ser Leu Ile
115 120 125

Ser Leu Asp Leu Tyr Asn Asn Asn Leu Thr Gly Lys Ile Pro Ser Ser
130 135 140

Leu Gly Lys Leu Lys Ser Leu Val Phe Leu Arg Leu Asn Glu Asn Arg
145 150 155 160

Leu Thr Gly Pro Ile Pro Arg Glu Leu Thr Val Ile Ser Ser Leu Lys
165 170 175

Val Val Asp Val Ser Gly Asn Asp Leu Cys Gly Thr Ile Pro Val Glu
180 185 190

Gly Pro Phe Glu His Ile Pro Met Gln Asn Phe Glu Asn Asn Leu Arg
195 200 205

Leu Glu Gly Pro Glu Leu Leu Gly Leu Ala Ser Tyr Asp Thr Asn Cys
210 215 220

Thr
225

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1063 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA to mRNA

(iii) HYPOTHETICAL: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 106..759



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

TCGACCCACG CGTCCGACGA AACCCTAATT TTGCTTCCTC ATOTICTTCA GAAAATTACT 60

CAAATTCCTA TTAGATTACT CTCTCTTCGA CCTCCGATAG CTCAC ATG GCG TCT 114
Met Ala Ser
1

CGA AAC TAT CGG TGG GAG CTC TTC GCA GCT TCG TTA ATC CTA ACC TTA 162
Arg Asn Tyr Arg Trp Glu Leu Phe Ala Ala Ser Leu Ile Leu Thr Leu
5 10 15

GCT TTG ATT CAC CTG GTC GAA GCA AAC TCC GAA GGA GAT GCT CTT TAC 210
Ala Leu Ile His Leu Val Glu Ala Asn Ser Glu Gly Asp Ala Leu Tyr
20 25 30 35

GCT CTT CGC CGG AGT TTA ACA GAT CCG GAC CAT GTT CTC CAG AGC TGG 258
Ala Leu Arg Arg Ser Leu Thr Asp Pro Asp His Val Leu Gln Ser Trp
40 45 50

GAT CCA ACT CTT GTT AAT CCT TGT ACC TGG TTC CAT GTC ACC TGT AAC 306
Asp Pro Thr Leu Val Asn Pro Cys Thr Trp Phe His Val Thr Cys Asn
55 60 65

CAA GAC AAC CGC GTC ACT CGT GTG GAT TTG GGG AAT TCA AAC CTC TCT 354
Gln Asp Asn Arg Val Thr Arg Val Asp Leu Gly Asn Ser Asn Leu Ser
70 75 80

GGA CAT CTT GCG CCT GAG CTT GGG AAG CTT GAA CAT TTA CAG TAT CTA 402
Gly His Leu Ala Pro Glu Leu Gly Lys Leu Glu His Leu Gln Tyr Leu
85 90 95

GAG CTC TAC AAA AAC AAC ATC CAA GGA ACT ATA CCT TCC GAA CTT GGA 450
Glu Leu Tyr Lys Asn Asn Ile Gln Gly Thr Ile Pro Ser Glu Leu Gly
100 105 110 115

AAT CTG AAG AAT CTC ATC AGC TTG GAT CTG TAC AAC AAC AAT CTT ACA 498
Asn Leu Lys Asn Leu Ile Ser Leu Asp Leu Tyr Asn Asn Asn Leu Thr
120 125 130

GGG ATA GTT CCC ACT TCT TTG GGA AAA TTG AAG TCT CTG GTC TTT TTA 546
Gly Ile Val Pro Thr Ser Leu Gly Lys Leu Lys Ser Leu Val Phe Leu
135 140 145

CGG CTT AAT GAC AAC CGA TTG ACG GGG CCA ATC CCT AGA GCA CTC ACT 594
Arg Leu Asn Asp Asn Arg Leu Thr Gly Pro Ile Pro Arg Ala Leu Thr
150 155 160

GCA ATC CCA AGC CTT AAA GTT GTT GAT GTC TCA AGC AAT GAT TTG TGT 642
Ala Ile Pro Ser Leu Lys Val Val Asp Val Ser Ser Asn Asp Leu Cys
165 170 175

GGA ACA ATC CCA ACA AAC GGA CCT TTT GCT CAC ATT CCT TTA CAG AAC 690
Gly Thr Ile Pro Thr Asn Gly Pro Phe Ala His Ile Pro Leu Gln Asn
180 185 190 195

TTT GAG AAC AAC CCG AGG TTG GAG GGA CCG GAA TTA CTC GGT CTT GCA 738
Phe Glu Asn Asn Pro Arg Leu Glu Gly Pro Glu Leu Leu Gly Leu Ala
200 205 210

AGC TAC GAC ACT AAC TGC ACC TGAAAAAATT GGCAAAACCT GAAAATGAAG 789
Ser Tyr Asp Thr Asn Cys Thr
215

AATTGGGGGG TGACCTTGTA AGAACACTTC ACCACTTTAT CAAATATCAC ATCTACTATG 849

TAATAAGTAT ATATATGTAG TCCAAAAAAA AAATGAAGAA TCGAATCACT AATATCATCT 909

GCTCTCAATT GAGAACTTTC AGCTCMCTGT ATGTAAAATT TCTAAATGCG ACTTKX3COT 969

ACTGTAATGfr TCGGTTCITGG GATTCTGAGA AGTAACATTT GTATTGOTAT GGTATCAAGT 1029

TX3ITCTGCCT TCTCTGCAAA AAAAAAAAAA AAAA 1063



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 218 amino acids

(B) TYPE: amino acid 



(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

Met Ala Ser Arg Asn Tyr Arg Trp Glu Leu Phe Ala Ala Ser Leu Ile
1 5 10 15

Leu Thr Leu Ala Leu Ile His Leu Val Glu Ala Asn Ser Glu Gly Asp
20 25 30

Ala Leu Tyr Ala Leu Arg Arg Ser Leu Thr Asp Pro Asp His Val Leu
35 40 45

Gln Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp Phe His Val
50 55 60

Thr Cys Asn Gln Asp Asn Arg Val Thr Arg Val Asp Leu Gly Asn Ser
65 70 75 80

Asn Leu Ser Gly His Leu Ala Pro Glu Leu Gly Lys Leu Glu His Leu
85 90 95

Gln Tyr Leu Glu Leu Tyr Lys Asn Asn Ile Gln Gly Thr Ile Pro Ser
100 105 110

Glu Leu Gly Asn Leu Lys Asn Leu Ile Ser Leu Asp Leu Tyr Asn Asn
115 120 125

Asn Leu Thr Gly Ile Val Pro Thr Ser Leu Gly Lys Leu Lys Ser Leu
130 135 140

Val Phe Leu Arg Leu Asn Asp Asn Arg Leu Thr Gly Pro Ile Pro Arg
145 150 155 160

Ala Leu Thr Ala Ile Pro Ser Leu Lys Val Val Asp Val Ser Ser Asn
165 170 175

Asp Leu Cys Gly Thr Ile Pro Thr Asn Gly Pro Phe Ala His Ile Pro
180 185 190

Leu Gln Asn Phe Glu Asn Asn Pro Arg Leu Glu Gly Pro Glu Leu Leu
195 200 205

Gly Leu Ala Ser Tyr Asp Thr Asn Cys Thr
210 215



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2089 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Arabidopsis thaliana 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: SERK gene cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 195..2069

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

GGATTTTTAT TTTATTTTTT ACTCTTTGTT TGTTTTAATG CTAATGGGTT TTTAAAAGGG   60
TTATCGAAAA AATGAGTGAG TTTGTGTTGA GGTTGTCTCT CTAAAGTGTT AATGGTGGTG   120
ATTTTCGGAA GTTAGGGTTT TCTCGGATCT GAAGAGATCA AATCAAGATT CGAAATTTAC   180

CATTGTTGTT TGAA ATG GAG TCG ACT TAT GTG GTG TTT ATC TTA CTT TCA   230
Met Glu Ser Ser Tyr Val Val Phe Ile Leu Leu Ser
1 5 10

CTG ATC TTA CTT CCG AAT CAT TCA CTG TGG CTT GCT TCT GCT AAT TTG   278
Leu Ile Leu Leu Pro Asn His Ser Leu Trp Leu Ala Ser Ala Asn Leu
15 20 25

GAA GGT GAT GCT TTG CAT ACT TTG AGG GTT ACT CTA GTT GAT CCA AAC   326
Glu Gly Asp Ala Leu His Thr Leu Arg Val Thr Leu Val Asp Pro Asn
30 35 40

AAT GTC TTG CAG AGC TGG GAT CCT ACG CTA GTG AAT CCT TGC ACA TGG   374
Asn Val Leu Gln Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp
45 50 55 60

TTC CAT GTC ACT TGC AAC AAC GAG AAC AGT GTC ATA AGA GTT GAT TTG   422
Phe His Val Thr Cys Asn Asn Glu Asn Ser Val Ile Arg Val Asp Leu
65 70 75

GGG AAT GCA GAG TTA TCT GGC CAT TTA GTT CCA GAG CTT GGT GTG CTC   470
Gly Asn Ala Glu Leu Ser Gly His Leu Val Pro Glu Leu Gly Val Leu
80 85 90

AAG AAT TTG CAG TAT TTG GAG CTT TAC AGT AAC AAC ATA ACT GGC CCG   518
Lys Asn Leu Gln Tyr Leu Glu Leu Tyr Ser Asn Asn Ile Thr Gly Pro
95 100 105

ATT CCT AGT AAT CTT GGA AAT CTG ACA AAC TTA GTG AGT TTG GAT CTT   566
Ile Pro Ser Asn Leu Gly Asn Leu Thr Asn Leu Val Ser Leu Asp Leu
110 115 120

TAC TTA AAC AGC TTC TCC GGT CCT ATT CCG GAA TCA TTG GGA AAG CTT   614
Tyr Leu Asn Ser Phe Ser Gly Pro Ile Pro Glu Ser Leu Gly Lys Leu
125 130 135 140

TCA AAG CTG AGA TTT CTC CGG CTT AAC AAC AAC ACT CTC ACT GGG TCA   662
Ser Lys Leu Arg Phe Leu Arg Leu Asn Asn Asn Ser Leu Thr Gly Ser
145 150 155

ATT CCT ATG TCA CTG ACC AAT ATT ACT ACC CTT CAA GTG TTA GAT CTA   710
Ile Pro Met Ser Leu Thr Asn Ile Thr Thr Leu Gln Val Leu Asp Leu
160 165 170

TCA AAT AAC AGA CTC TCT GGT TCA GTT CCT GAC AAT GGC TCC TTC TCA   758
Ser Asn Asn Arg Leu Ser Gly Ser Val Pro Asp Asn Gly Ser Phe Ser
175 180 185

CTC TTC ACA CCC ATC ACT TTT GCT AAT AAC TTA GAC CTA TGT GGA CCT   806
Leu Phe Thr Pro Ile Ser Phe Ala Asn Asn Leu Asp Leu Cys Gly Pro
190 195 200

GTT ACA AGT CAC CCA TGT CCT GGA TCT CCC CCG TTT TCT CCT CCA CCA   854
Val Thr Ser His Pro Cys Pro Gly Ser Pro Pro Phe Ser Pro Pro Pro
205 210 215 220

CCT TTT ATT CAA CCT CCC CCA GTT TCC ACC CCG ACT GGG TAT GGT ATA   902
Pro Phe Ile Gln Pro Pro Pro Val Ser Thr Pro Ser Gly Tyr Gly Ile
225 230 235

ACT GGA GCA ATA GCT GGT GGA GTT GCT GCA GGT GCT GCT TTG CCC TTT   950
Thr Gly Ala Ile Ala Gly Gly Val Ala Ala Gly Ala Ala Leu Pro Phe
240 245 250

GCT GCT CCT GCA ATA GGC TTT GCT TGG TGG CGA GGA AGA AGC CCA CTA   998
Ala Ala Pro Ala Ile Ala Phe Ala Trp Trp Arg Arg Arg Ser Pro Leu
255 260 265

GAT ATT TTC TTC GAT GTC CCT GCC GAA GAA GAT CCA GAA GTT CAT CTG   1046
Asp Ile Phe Phe Asp Val Pro Ala Glu Glu Asp Pro Glu Val His Leu
270 275 280

GGA CAG CTC AAG AGG TTT TCT TTG CGG GAG CTA CAA GTG GCG AGT GAT   1094
Gly Gln Leu Lys Arg Phe Ser Leu Arg Glu Leu Gln Val Ala Ser Asp
285 290 295 300

GGG TTT AGT AAC AAG AAC ATT TTG GGC AGA GGT GGG TTT GGG AAA GTC   1142
Gly Phe Ser Asn Lys Asn Ile Leu Gly Arg Gly Gly Phe Gly Lys Val
305 310 315

TAC AAG GGA CGC TTG GCA GAC GGA ACT CTT GTT GCT GTC AAG AGA CTG   1190
Tyr Lys Gly Arg Leu Ala Asp Gly Thr Leu Val Ala Val Lys Arg Leu
320 325 330

AAG GAA GAG CGA ACT CCA GGT GGA GAG CTC CAG TTT CAA ACA GAA GTA   1238
Lys Glu Glu Arg Thr Pro Gly Gly Glu Leu Gln Phe Gln Thr Glu Val
335 340 345

GAG ATG ATA AGT ATG GCA GTT CAT CGA AAC CTG TTG AGA TTA CGA GGT   1286
Glu Met Ile Ser Met Ala Val His Arg Asn Leu Leu Arg Leu Arg Gly
350 355 360

TTC TGT ATG ACA CCG ACC GAG AGA TTG CTT GTG TAT CCT TAC ATG GCC   1334
Phe Cys Met Thr Pro Thr Glu Arg Leu Leu Val Tyr Pro Tyr Met Ala
365 370 375 380

AAT GGA AGT GTT GCT TCG TGT CTC AGA GAG AGG CCA CCG TCA CAA CCT   1382
Asn Gly Ser Val Ala Ser Cys Leu Arg Glu Arg Pro Pro Ser Gln Pro
385 390 395

CCG CTT GAT TGG CCA AGG CGG AAG AGA ATC GCG CTA GGC TCA GCT CGA   1430
Pro Leu Asp Trp Pro Thr Arg Lys Arg Ile Ala Leu Gly Ser Ala Arg
400 405 410

GGT TTG TCT TAC CTA CAT GAT CAC TGC GAT CCG AAG ATC ATT CAC CGT   1478
Gly Leu Ser Tyr Leu His Asp His Cys Asp Pro Lys Ile Ile His Arg
415 420 425

GAC GTA AAA GCA GCA AAC ATC CTC TTA GAC GAA GAA TTC GAA GCG GTT   1526
Asp Val Lys Ala Ala Asn Ile Leu Leu Asp Glu Glu Phe Glu Ala Val
430 435 440

GTT GGA GAT TTC GGG TTG GCA AAG CTT ATG GAC TAT AAA GAC ACT CAC   1574
Val Gly Asp Phe Gly Leu Ala Lys Leu Met Asp Tyr Lys Asp Thr His
445 450 455 460

GTG ACA ACA GCA GTC CGT GGC ACC ATC GGT CAC ATC GCT CCA GAA TAT   1622
Val Thr Thr Ala Val Arg Gly Thr Ile Gly His Ile Ala Pro Glu Tyr
465 470 475

CTC TCA ACC GGA AAA TCT TCA GAG AAA ACC GAC GTT TTC GGA TAC GGA   1670
Leu Ser Thr Gly Lys Ser Ser Glu Lys Thr Asp Val Phe Gly Tyr Gly
480 485 490

ATC ATG CTT CTA GAA CTA ATC ACA GGA CAA AGA GCT TTC GAT CTC GCT   1718
Ile Met Leu Leu Glu Leu Ile Thr Gly Gln Arg Ala Phe Asp Leu Ala
495 500 505

CGG CTA GCT AAC GAC GAC GAC GTC ATG TTA CTT GAC TGG GTG AAA GGA   1766
Arg Leu Ala Asn Asp Asp Asp Val Met Leu Leu Asp Trp Val Lys Gly
510 515 520

TTG TTG AAG GAG AAG AAG CTA GAG ATG TTA GTG GAT CCA GAT CTT CAA   1814
Leu Leu Lys Glu Lys Lys Leu Glu Met Leu Val Asp Pro Asp Leu Gln
525 530 535 540

ACA AAC TAC GAG GAG AGA GAA CTG GAA CAA GTG ATA CAA GTG GCG TTG   1862
Thr Asn Tyr Glu Glu Arg Glu Leu Glu Gln Val Ile Gln Val Ala Leu
545 550 555

CTA TGC ACG CAA GGA TCA CCA ATG GAA AGA CCA AAG ATG TCT GAA GTT   1910
Leu Cys Thr Gln Gly Ser Pro Met Glu Arg Pro Lys Met Ser Glu Val
560 565 570

CTA AGG ATG CTG GAA GGA GAT GGG CTT GCG GAG AAA TGG GAC GAA TGG   1958
Val Arg Met Leu Glu Gly Asp Gly Leu Ala Glu Lys Trp Asp Glu Trp
575 580 585

CAA AAA GTT GAG ATT TTG AGG GAA GAG ATT GAT TTG AGT CCT AAT CCT   2006
Gln Lys Val Glu Ile Leu Arg Glu Glu Ile Asp Leu Ser Pro Asn Pro
590 595 600

AAC TCT GAT TGG ATT CTT GAT TCT ACT TAC AAT TTG CAC GCC GTT GAG   2054
Asn Ser Asp Trp Ile Leu Asp Ser Thr Tyr Asn Leu His Ala Val Glu
605 610 615 620

TTA TCT GGT CCA AGG TAAAAAAAAA AAAAAAAAAA   2089
Leu Ser Gly Pro Arg
625
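
The feature table for SEQ ID NO: 32 places the CDS at positions 195 to 2069, and SEQ ID NO: 33 is stated to be 625 amino acids long. As a consistency check, the following minimal Python sketch (not part of the original listing) computes the codon count implied by that location and translates one intact codon row of the listing. The function names `cds_codon_count` and `translate` are hypothetical, and the standard genetic code is assumed.

```python
# Standard genetic code (assumed), codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def cds_codon_count(start, end):
    """Number of whole codons in a 1-based, inclusive CDS location."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# CDS location taken from the SEQ ID NO: 32 feature table.
codons = cds_codon_count(195, 2069)

# Codon row covering residues 13 to 28 of SEQ ID NO: 32.
row = "CTG ATC TTA CTT CCG AAT CAT TCA CTG TGG CTT GCT TCT GCT AAT TTG"
peptide = translate(row)
```

Under these assumptions `codons` evaluates to 625, which agrees with the length given for SEQ ID NO: 33, and `peptide` is the one-letter form of Leu Ile Leu Leu Pro Asn His Ser Leu Trp Leu Ala Ser Ala Asn Leu.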



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 625 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Met Glu Ser Ser Tyr Val Val Phe Ile Leu Leu Ser Leu Ile Leu Leu
1 5 10 15

Pro Asn His Ser Leu Trp Leu Ala Ser Ala Asn Leu Glu Gly Asp Ala
20 25 30

Leu His Thr Leu Arg Val Thr Leu Val Asp Pro Asn Asn Val Leu Gln
35 40 45

Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp Phe His Val Thr
50 55 60

Cys Asn Asn Glu Asn Ser Val Ile Arg Val Asp Leu Gly Asn Ala Glu
65 70 75 80

Leu Ser Gly His Leu Val Pro Glu Leu Gly Val Leu Lys Asn Leu Gln
85 90 95

Tyr Leu Glu Leu Tyr Ser Asn Asn Ile Thr Gly Pro Ile Pro Ser Asn
100 105 110

Leu Gly Asn Leu Thr Asn Leu Val Ser Leu Asp Leu Tyr Leu Asn Ser
115 120 125

Phe Ser Gly Pro Ile Pro Glu Ser Leu Gly Lys Leu Ser Lys Leu Arg
130 135 140

Phe Leu Arg Leu Asn Asn Asn Ser Leu Thr Gly Ser Ile Pro Met Ser
145 150 155 160

Leu Thr Asn Ile Thr Thr Leu Gln Val Leu Asp Leu Ser Asn Asn Arg
165 170 175

Leu Ser Gly Ser Val Pro Asp Asn Gly Ser Phe Ser Leu Phe Thr Pro
180 185 190

Ile Ser Phe Ala Asn Asn Leu Asp Leu Cys Gly Pro Val Thr Ser His
195 200 205

Pro Cys Pro Gly Ser Pro Pro Phe Ser Pro Pro Pro Pro Phe Ile Gln
210 215 220

Pro Pro Pro Val Ser Thr Pro Ser Gly Tyr Gly Ile Thr Gly Ala Ile
225 230 235 240

Ala Gly Gly Val Ala Ala Gly Ala Ala Leu Pro Phe Ala Ala Pro Ala
245 250 255

Ile Ala Phe Ala Trp Trp Arg Arg Arg Ser Pro Leu Asp Ile Phe Phe
260 265 270

Asp Val Pro Ala Glu Glu Asp Pro Glu Val His Leu Gly Gln Leu Lys
275 280 285

Arg Phe Ser Leu Arg Glu Leu Gln Val Ala Ser Asp Gly Phe Ser Asn
290 295 300

Lys Asn Ile Leu Gly Arg Gly Gly Phe Gly Lys Val Tyr Lys Gly Arg
305 310 315 320

Leu Ala Asp Gly Thr Leu Val Ala Val Lys Arg Leu Lys Glu Glu Arg
325 330 335

Thr Pro Gly Gly Glu Leu Gln Phe Gln Thr Glu Val Glu Met Ile Ser
340 345 350

Met Ala Val His Arg Asn Leu Leu Arg Leu Arg Gly Phe Cys Met Thr
355 360 365

Pro Thr Glu Arg Leu Leu Val Tyr Pro Tyr Met Ala Asn Gly Ser Val
370 375 380

Ala Ser Cys Leu Arg Glu Arg Pro Pro Ser Gln Pro Pro Leu Asp Trp
385 390 395 400

Pro Thr Arg Lys Arg Ile Ala Leu Gly Ser Ala Arg Gly Leu Ser Tyr
405 410 415

Leu His Asp His Cys Asp Pro Lys Ile Ile His Arg Asp Val Lys Ala
420 425 430

Ala Asn Ile Leu Leu Asp Glu Glu Phe Glu Ala Val Val Gly Asp Phe
435 440 445

Gly Leu Ala Lys Leu Met Asp Tyr Lys Asp Thr His Val Thr Thr Ala
450 455 460

Val Arg Gly Thr Ile Gly His Ile Ala Pro Glu Tyr Leu Ser Thr Gly
465 470 475 480

Lys Ser Ser Glu Lys Thr Asp Val Phe Gly Tyr Gly Ile Met Leu Leu
485 490 495

Glu Leu Ile Thr Gly Gln Arg Ala Phe Asp Leu Ala Arg Leu Ala Asn
500 505 510

Asp Asp Asp Val Met Leu Leu Asp Trp Val Lys Gly Leu Leu Lys Glu
515 520 525

Lys Lys Leu Glu Met Leu Val Asp Pro Asp Leu Gln Thr Asn Tyr Glu
530 535 540

Glu Arg Glu Leu Glu Gln Val Ile Gln Val Ala Leu Leu Cys Thr Gln
545 550 555 560

Gly Ser Pro Met Glu Arg Pro Lys Met Ser Glu Val Val Arg Met Leu
565 570 575

Glu Gly Asp Gly Leu Ala Glu Lys Trp Asp Glu Trp Gln Lys Val Glu
580 585 590

Ile Leu Arg Glu Glu Ile Asp Leu Ser Pro Asn Pro Asn Ser Asp Trp
595 600 605

Ile Leu Asp Ser Thr Tyr Asn Leu His Ala Val Glu Leu Ser Gly Pro
610 615 620

Arg
625



WO 97/43427 



-100- 



PCT/EP97/02443 



REFERENCES 

Aleith, F. and Richter, G. (1990) Planta 183, 17-24.

Bent, A.F., Kunkel, B.N., Dahlbeck, D., Brown, K.L., Schmidt, R., Giraudat, J., Leung, J. and
Staskawicz, B.J. (1994) Science 265, 1856-1860.

Braun, T., Schofield, P.R. and Sprengel, R. (1991) EMBO J. 10, 1885-1890.

Dangl, J.L. (1995) Cell 80, 363-366.

De Jong, A.J., Schmidt, E.D.L. and De Vries, S.C. (1993) Plant Mol. Biol. 5, 367-377.
De Vries, S.C., Booij, H., Meyerink, P., Huisman, G., Wilde, H.D., Thomas, T.L. and Van
Kammen, A. (1988a) Planta 176, 196-204.

De Vries, S.C., Hoge, H. and Bisseling, T. (1988b) In Plant Molecular Biology Manual, B6,
S.B. Gelvin, R.A. Schilperoort, and D.P.S. Verma, eds (Dordrecht, the Netherlands: Kluwer
Academic Publishers), pp 1-13.

Dubois, T., Guedira, M., Dubois, D. and Vasseur, J. (1991) Protoplasma 162, 120-127.
Dudits, D., Gyorgyey, J., Bogre, L. and Bako, L. (1995) In: In vitro Embryogenesis in Plants. 
Ed. Thorpe, T.A. Kluwer Press, pp. 267-308. 

Engler J.A., Van Montagu M. and Engler G. (1994) Plant Mol. Biol. Rep. 12, 321-331. 
Gamborg, O.L., Miller, R.A. and Ojima, K. (1968) Exp. Cell. Res. 50, 151-158. 
Giorgetti, L., Vergara, M.R., Evangelista, M., LoSchiavo, F., Terzi, M. and Ronchi, V.N.
(1995) Mol. Gen. Genet. 246, 657-662.

Goldberg, R.B., Barker, S.J. and Perez-Grau, L. (1989) Cell 56, 149-160.

Goldberg, R.B., de Paiva, G. and Yadegari, R. (1994) Science 266, 605-614.

Govind, S. and Steward, R. (1991) Dorsoventral pattern formation in Drosophila. Trends
Genet. 7, 119-125.

Guzzo, F., Baldan, B., Levi, M., Sparvoli, E., LoSchiavo, F., Terzi, M. and Mariani, P. (1995)
Protoplasma 185, 28-36.

Guzzo, F., Baldan, B., Mariani, P., LoSchiavo, F. and Terzi, M. (1994) Exper. Botany 45,
1427-1432.

Hanks, S.K., Quinn, A.M. and Hunter, T. (1988) Science 241, 42-52.
Hashimoto, C., Hudson, K.L. and Anderson, K.V. (1988) Cell 52, 269-279.
Heck, G.R., Perry, S.E., Nichols, K.W. and Fernandez, D.E. (1995) AGL15, a MADS
domain protein expressed in developing embryos. Plant Cell 7, 1271-1282.






Heldin, C-H. (1995) Cell 80, 213-223. 

Hodge, R., Paul, Wyatt, Draper, J. and Scott, R. (1992) Plant J. 2, 257-260.

Horn, M.A. and Walker, J.C. (1994) Biochim. Biophys. Acta 1208, 65-74. 

Jones, D.A., Thomas, C.M., Hammond-Kosack, K.E., Balint-Kurti, P.J. and Jones, J.D.G.
(1994) Science 266, 789-792.

Kobe, B. and Deisenhofer, J. (1994) TIBS 19, 415-421.

Li, F., Barnathan, E.S. and Karikó, K. (1994) Nucl. Acid Res. 22, 1764-1765.

Liang, P. and Pardee, A.B. (1992) Science, 257, 967-971. 

Meyerowitz, E.M. (1995) EDBC 95 congress of the European developmental biology 
organization. Abstract SI4. 

Mu, J.-H., Lee, H.-S. and Kao, T.-h. (1994) Plant Cell 6, 709-721.

Pennell, R.I., Janniche, L., Scofield, G.N., Booij, H., de Vries, S.C. and Roberts, K. (1992) J.
Cell Biol. 119, 1371-1380.

Rounsley, S.D., Ditta, G.S. and Yanofsky, M.F. (1995) Plant Cell 7, 1259-1269.

Sato, S., Toya, T., Kawahara, R., Whittier, R.F., Fukuda, H. and Komamine, A. (1995) Plant
Mol. Biol. 28, 39-46.

Shelton, C.A. and Wasserman, S.A. (1993) Cell 72, 515-525. 

Sterk, P. and De Vries, S.C. (1992) In Redenbaugh K (ed), Synseeds: Applications of
synthetic seeds to crop improvement, CRC Press, London (1992). 
Sterk, P., Booij, H., Schellekens, G.A., Van Kammen, A. and De Vries, S.C. (1991) Plant 
Cell 3, 907-921. 

Thomas, T.L. (1993) Plant Cell 5, 1401-1410.

Toonen, M.A.J. and De Vries, S.C. (1995) In: Embryogenesis, the generation of a plant.
Eds. Wang, T.L. and Cuming, A. BIOS Scientific Publishers, Oxford, UK, pp. 173-189.
Toonen, M.A.J., Hendriks, T., Schmidt, E.D.L., Verhoeven, H.A., Van Kammen, A. and De
Vries, S.C. (1994) Planta 194, 565-572.

Toonen, M.A.J., Schmidt, E.D.L., Hendriks, T., Verhoeven, H.A., Van Kammen, A. and De
Vries, S.C. (1996) submitted.

Torii, K.U. and Komeda, Y. (1994) 4th international congress of plant molecular biology.
Abstract 692.

Van Engelen, F.A. and De Vries, S.C. (1992) Trends Genet. 8, 66-70.
Varner, J.E. and Lin, L.-S. (1989) Cell 56, 231-239.
Walker, J.C. (1994) Plant Mol. Biol. 26, 1599-1609.






Walker, J.C. (1993) Plant J. 3, 451-456.

Whitham, S., Dinesh-Kumar, S.P., Choi, D., Hehl, R., Corr, C. and Baker, B. (1994) Cell 78,
1101-1115.

Zhao, Y., Feng, X.H., Watson, J.C., Bottino, P.J. and Kung, S.D. (1994) Plant Molec. Biol.
26, 791-803.

Zimmerman, J.L. (1993) Plant Cell 5, 1411-1423.






What is Claimed is: 

1. A method of producing apomictic seeds comprising the steps of:

(i) transforming plant material with a nucleotide sequence encoding a protein the
presence of which in an active form in a cell, or membrane thereof, renders said cell
embryogenic,

(ii) regenerating the thus transformed material into plants, or carpel-containing parts 
thereof, and 

(iii) expressing the sequence in the vicinity of the embryo sac. 

2. A method according to the preceding claim, wherein the apomictic seeds are of the 
adventitious embryony type. 

3. A method according to either of the preceding claims, wherein expression of the 
sequence yields a protein kinase capable of spanning a plant cell membrane. 

4. A method according to the preceding claim wherein the kinase is capable of 
autophosphorylation. 

5. A method according to any of the preceding claims, wherein the protein is a leucine rich 
repeat receptor like kinase and comprises a ligand binding domain, a proline box, a 
transmembrane domain, a kinase domain and a protein binding domain. 

6. A method according to the preceding claim, wherein the protein lacks a functional ligand 
binding domain but comprises a proline box, a transmembrane domain, a kinase domain 
and a protein binding domain. 

7. A method according to any preceding claim, wherein once incorporated into the cell 
membrane, the protein binding domain is located intra-cellularly. 

8. A method according to any preceding claim, wherein the sequence further encodes a
cell membrane targeting sequence.

9. A method according to any preceding claim, wherein the sequence is that depicted in
SEQ ID Nos. 1 or 2 or is complementary to one which hybridizes under stringent
conditions with the said sequences and which encodes a membrane bound protein
having kinase activity.

10. A method according to any preceding claim, wherein the sequence is modified in that 
known mRNA instability motifs or polyadenylation signals are removed and/or codons 
which are preferred by the plant into which the sequence is to be inserted are used so 
that expression of the thus modified sequence in the said plant yields substantially 
similar protein to that obtained by expression of the unmodified sequence in the 
organism in which the protein is endogenous. 

11. A method according to any preceding claim, wherein expression of the sequence is
under control of an inducible or developmentally regulated promoter.

12. A method according to the preceding claim, wherein expression of the sequence is 
under control of one of the following: a promoter which regulates expression of SERK 
genes in planta, the carrot chitinase DcEP3-1 gene promoter, the Arabidopsis AtChitIV 
gene promoter, the Arabidopsis LTP-1 gene promoter, the Arabidopsis bel-1 gene 
promoter, the petunia fbp-7 gene promoter, the Arabidopsis ANT gene promoter, the 
promoter of the 0126 gene from Phalaenopsis. 

13. A method according to any of the preceding claims, wherein the sequence is expressed 
in the somatic cells of the embryo sac, ovary wall, nucellus, or integuments. 

14. A method according to any of the preceding claims, wherein the endosperm within the 
apomictic seed results from fusion of polar nuclei within the embryo sac with a pollen- 
derived male gamete nucleus. 

15. A method according to the preceding claim, wherein the sequence encoding the protein
is expressed prior to fusion of the polar nuclei with the male gamete nucleus.

16. DNA comprising a sequence encoding a protein the presence of which in an active form
in a cell, or membrane thereof, renders said cell embryogenic.

17. DNA according to claim 16, wherein the protein is a leucine rich repeat receptor like
kinase and comprises a ligand binding domain, a proline box, a transmembrane domain,
a kinase domain and a protein binding domain, the ligand binding domain optionally
being absent or functionally inactive.

18. DNA according to either of claims 16 or 17 comprising a DNA sequence encoding a N-terminal
protein fragment having the following amino acid sequence: Gln Ser Trp Asp Pro
Thr Leu Val Asn Pro Cys Thr Trp Phe His

19. DNA according to claim 18 comprising a DNA sequence encoding a N-terminal protein
fragment having the following amino acid sequence: Val Xaa Gln Ser Trp Asp Pro Thr Leu Val
Asn Pro Cys Thr Trp Phe His Val Thr Cys Asn

with Xaa being a variable amino acid, but preferably Leu or Val.

20. DNA according to claim 19 comprising a DNA sequence encoding a N-terminal protein
fragment having the following amino acid sequence: Val Xaa Gln Ser Trp Asp Pro Thr Leu Val
Asn Pro Cys Thr Trp Phe His Val Thr Cys Asn Xab Xac Xad Xae Val Xaf Arg Val Asp Leu Gly Asn
Xag Xah Leu Ser Gly His Leu Xai Pro Glu Leu Gly Xaj Leu Xak Xal Leu Gln

with Xaa to Xak being a variable amino acid, but preferably

Xaa = Leu or Val

Xab = Asn or Gln

Xac = Glu or Asp or His

Xad = Asn or His

Xae = Ser or Arg or Gln

Xaf = Ile or Thr

Xag = Ala or Ser

Xah = Glu or Asn

Xai = Val or Ala

Xaj = Val or Lys

Xak = Lys or Glu

Xal = Asn or His
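
The variable positions of claim 20 can be checked mechanically against a candidate fragment. The following is a minimal Python sketch, not part of the claims: the names `MOTIF`, `PREFERRED`, `matches_motif` and `SERK_FRAGMENT` are hypothetical, the preferred residues are read from the claim text as printed (the third option for Xae is taken to be Gln, which is an assumption), and only the preferred residues are accepted at the variable positions. The test fragment is residues 46 to 96 of SEQ ID NO: 33.

```python
# Claim 20 motif, one three-letter residue per position; Xaa..Xal are variable.
MOTIF = (
    "Val Xaa Gln Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp Phe His "
    "Val Thr Cys Asn Xab Xac Xad Xae Val Xaf Arg Val Asp Leu Gly Asn "
    "Xag Xah Leu Ser Gly His Leu Xai Pro Glu Leu Gly Xaj Leu Xak Xal Leu Gln"
).split()

# Preferred residues per variable position (assumed reading of the claim text).
PREFERRED = {
    "Xaa": {"Leu", "Val"},
    "Xab": {"Asn", "Gln"},
    "Xac": {"Glu", "Asp", "His"},
    "Xad": {"Asn", "His"},
    "Xae": {"Ser", "Arg", "Gln"},  # third option is an assumption
    "Xaf": {"Ile", "Thr"},
    "Xag": {"Ala", "Ser"},
    "Xah": {"Glu", "Asn"},
    "Xai": {"Val", "Ala"},
    "Xaj": {"Val", "Lys"},
    "Xak": {"Lys", "Glu"},
    "Xal": {"Asn", "His"},
}


def matches_motif(residues):
    """Return True if the residue list fits the motif at every position."""
    if len(residues) != len(MOTIF):
        return False
    # Fixed positions must match exactly; variable ones must be a preferred residue.
    return all(
        got in PREFERRED.get(want, {want})
        for want, got in zip(MOTIF, residues)
    )


# Residues 46 to 96 of SEQ ID NO: 33.
SERK_FRAGMENT = (
    "Val Leu Gln Ser Trp Asp Pro Thr Leu Val Asn Pro Cys Thr Trp Phe His "
    "Val Thr Cys Asn Asn Glu Asn Ser Val Ile Arg Val Asp Leu Gly Asn "
    "Ala Glu Leu Ser Gly His Leu Val Pro Glu Leu Gly Val Leu Lys Asn Leu Gln"
).split()

fits = matches_motif(SERK_FRAGMENT)
```

Because the claim says the listed residues are only preferred, a stricter or looser matcher may be appropriate; this sketch takes the strict reading.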

21. DNA comprising a sequence encoding a protein having the sequence depicted in SEQ 
ID No. 3, or a protein substantially similar thereto which is capable of being membrane 
bound and which has kinase activity. 

22. DNA comprising a sequence encoding a protein having the sequence depicted in SEQ 
ID No. 21, or a protein substantially similar thereto which is capable of being membrane
bound and which has kinase activity. 

23. DNA comprising a sequence encoding a protein having the sequence depicted in SEQ 
ID No. 33, or a protein substantially similar thereto which is capable of being membrane 
bound and which has kinase activity. 

24. DNA comprising a sequence encoding a protein having the sequence depicted in SEQ 
ID Nos. 23, 25, 27, 29 and 31, or a protein substantially similar thereto which is capable
of being membrane bound and which has kinase activity.

25. DNA according to any preceding claim, comprising a DNA having the sequence depicted 
in SEQ ID Nos. 1 or 2 or a sequence which is complementary to one which hybridizes 
under stringent conditions with the said sequences and which encodes a membrane 
bound protein having kinase activity. 

26. DNA according to any preceding claim, comprising a DNA having the sequence depicted 
in SEQ ID No: 20 or a sequence which is complementary to one which hybridizes under 
stringent conditions with the said sequences and which encodes a membrane bound 
protein having kinase activity. 

27. DNA according to any preceding claim, comprising a DNA having the sequence depicted
in SEQ ID No: 32 or a sequence which is complementary to one which hybridizes under
stringent conditions with the said sequences and which encodes a membrane bound
protein having kinase activity.

28. DNA according to any preceding claim, comprising a DNA having the sequence depicted
in SEQ ID Nos: 22, 24, 26, 28 and 30 or a sequence which is complementary to one
which hybridizes under stringent conditions with the said sequences and which encodes
a membrane bound protein having kinase activity.

29. DNA according to any of the preceding claims, which further encodes a cell membrane 
targeting sequence. 

30. DNA according to any one of the preceding claims, in which the protein encoding region 
is under expression control of a developmentally regulated or inducible promoter.

31. DNA according to claim 30, wherein the promoter is one of the following: a promoter
which regulates expression of SERK genes in plants, the carrot chitinase DcEP3-1 gene 
promoter, the Arabidopsis AtChitIV gene promoter, the Arabidopsis LTP-1 gene 
promoter, the Arabidopsis bel-1 gene promoter, the petunia fbp-7 gene promoter, the 
Arabidopsis ANT gene promoter, the promoter of the 0126 gene from Phalaenopsis; the 
Arabidopsis DMC1 promoter, the pTA7001 inducible promoter. 

32. DNA according to any preceding claim, wherein said DNA is a recombinant DNA. 

33. DNA according to any preceding claim, wherein the sequence is modified in that known 
mRNA instability motifs or polyadenylation signals are removed and/or codons which are 
preferred by the plant into which the DNA is to be inserted are used so that expression 
of the thus modified DNA in the said plant yields substantially similar protein to that 
obtained by expression of the unmodified DNA in the organism in which the protein is 
endogenous. 

34. DNA which is complementary to that which hybridizes under stringent conditions with the
DNA of any one of claims 16 to 29.

35. A vector containing a DNA sequence as claimed in any one of claims 16 to 34.

36. Plant cell transformed with the DNA of any one of claims 16 to 34 or the vector of claim
35, which contains the DNA stably incorporated into its genome.

37. Plant cell according to claim 36, which is part of a whole plant.

38. Plants transformed with the DNA of any one of claims 16 to 34 or the vector of claim 35,
the progeny of such plants which contain the DNA stably incorporated, and/or the
apomictic seeds of such plants or such progeny.

39. Plants transformed with the DNA comprised by the recombinant DNA of claims 16 to 34.

40. Use of the DNA of any one of claims 16-34 in the manufacture of apomictic seeds. 

41. Plants which are derived from apomictic seeds obtainable by the method of any one of
claims 1-15 or 40.

42. A method of obtaining cultivars comprising the steps of fertilizing plants with the pollen of 
the plants of either of claims 38, 39 or 40 and cultivars which result from the said 
method. 

43. A method of obtaining embryogenic cells in plant material, comprising transforming the 
material with a recombinant DNA sequence as claimed in any one of claims 16-34, the 
DNA comprised by the recombinant DNA of any one of claims 16 to 34, or the vector of 
claim 35, expressing the sequence in the material or derivatives thereof and subjecting 
the said material or derivatives to a compound which acts as a ligand for the gene 
product of the said sequence. 

44. A method according to the preceding claim, wherein the sequence encodes a leucine 
rich repeat receptor like kinase, and the compound is a phyto-hormone. 

45. A method of generating somatic embryos under in vitro conditions wherein the SERK 
protein is overexpressed ectopically. 

46. A bag containing apomictic seeds obtainable by the method of any one of claims 1-15 or
40.